This file (hdat9600_final_assignment.Rmd) is the R Markdown document in which you need to complete your HDAT9600 final assignment. This assignment is assessed and will count for 30% of the total course marks. The assignment comprises two tasks worth 15 marks each. The first task will focus on logistic regression, and the second task will focus on survival analysis. There is no word limit, but a report of about 10 pages in length when printed (except that it will not be printed) is appropriate.
Don’t hesitate to ask the course convenor for help via OpenLearning. The course instructors are happy to point you in the right direction and to make suggestions, but they won’t, of course, complete your assignments for you!
The data used for this assignment consist of records from Intensive Care Unit (ICU) hospital stays in the USA. All patients were adults who were admitted for a wide variety of reasons. ICU stays of less than 48 hours have been excluded.
The source data for the assignment are data made freely available for the 2012 MIT PhysioNet/Computing for Cardiology Challenge. Details are provided here. Training Set A data have been used. The original data have been modified and assembled to suit the purpose of this assignment. While not required for the purposes of this assignment, full details of the preparatory work can be found in the hdat9600_final_assignment_data_preparation file.
The dataframe consists of 120 variables, which are defined as follows:
Use the hyperlinks below to find out more about the clinical meaning of each variable. The first two clinical variables are summary scores that are used to assess patient condition and risk.
The following 36 clinical measures were assessed at multiple timepoints during each patient’s ICU stay. For each of the 36 clinical measures, you are given 3 summary variables: a) The minimum value during the first 24 hours in ICU (_min), b) The maximum value during the first 24 hours in ICU (_max), and c) The difference between the mean and the most extreme values during the first 24 hours in ICU (_diff). For example, for the clinical measure Cholesterol, these three variables are labelled ‘Cholesterol_min’, ‘Cholesterol_max’, and ‘Cholesterol_diff’.
The data frame can be loaded with the following code:
# Getting the path of your current open file
# Extra code to ensure this file imports the .rds data files from the local directory for everyone
library(rstudioapi)
current_path <- rstudioapi::getActiveDocumentContext()$path
setwd(dirname(current_path))
icu_patients_df0 <- readRDS("icu_patients_df0.rds")
icu_patients_df1 <- readRDS("icu_patients_df1.rds")
Note: icu_patients_df1 is an imputed (i.e. missing values are ‘derived’) version of icu_patients_df0. This assignment does not concern the methods used for imputation.
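One way to see what imputation changed is to count missing values per column in both data frames. The sketch below uses two small invented data frames (`df0_toy`, `df1_toy`) as stand-ins for `icu_patients_df0` and `icu_patients_df1`; the names and values are assumptions made for illustration, not part of the assignment data.

```r
# Toy stand-ins for icu_patients_df0 (raw) and icu_patients_df1 (imputed).
# All values are invented for illustration only.
df0_toy <- data.frame(
  Albumin_min = c(3.1, NA, 2.3, NA),
  Weight_min  = c(NA, 76.0, 56.7, 84.6)
)
df1_toy <- data.frame(
  Albumin_min = c(3.1, 2.2, 2.3, 4.4),
  Weight_min  = c(NA, 76.0, 56.7, 84.6)
)

# Count missing values per column in each data frame
na_raw     <- colSums(is.na(df0_toy))
na_imputed <- colSums(is.na(df1_toy))

# Side-by-side comparison: columns with filled > 0 had values imputed
na_compare <- data.frame(raw     = na_raw,
                         imputed = na_imputed,
                         filled  = na_raw - na_imputed)
print(na_compare)
```

Applied to the real data frames, the same `colSums(is.na(...))` call would show which variables still contain missing values after imputation.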
In this task, you are required to develop a logistic regression model using the icu_patients_df1 data set which adequately explains or predicts the in_hospital_death variable as the outcome using a subset of the available predictor variables. You should fit a series of models, evaluating each one, before you present your final model. Your final model should not include all the predictor variables, just a small subset of them, which you have selected based on statistical significance and/or background knowledge. It is perfectly acceptable to include predictor variables in your final model which are not statistically significant, as long as you justify their inclusion on medical or physiological grounds (you will not be marked down if your medical justification is not exactly correct or complete, but do your best). Aim for between five and ten predictor variables (slightly more or fewer is OK). You should assess each model you consider for goodness of fit and other relevant statistics to help you choose between them. For your final model, present a set of diagnostic statistics and/or charts and comment on them. You don’t need to do an exhaustive exploratory data analysis of all the variables in the data set, but you should examine those variables that you use in your model. Finally, re-fit your final model to the unimputed data frame (icu_patients_df0.rds) and comment on any differences you find compared to the same model fitted to the imputed data.
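As a minimal sketch of the "fit a series of models and compare them" step, the example below simulates a small data set and compares two nested logistic regression models by AIC and a likelihood ratio test. The data frame `sim`, the effect sizes, and the seed are all invented for illustration; only `glm()`, `AIC()`, and `anova()` from base R are used.

```r
set.seed(9600)  # arbitrary seed so the simulated example is reproducible

# Simulated stand-in data: variable names mimic the assignment data,
# but the values and effect sizes are invented.
n   <- 500
sim <- data.frame(
  Age     = rnorm(n, mean = 65, sd = 15),
  GCS_min = sample(3:15, n, replace = TRUE)
)
lin_pred  <- -3 + 0.03 * sim$Age - 0.10 * sim$GCS_min
sim$death <- rbinom(n, size = 1, prob = plogis(lin_pred))

# Two nested candidate models
m_small <- glm(death ~ Age, data = sim, family = binomial)
m_large <- glm(death ~ Age + GCS_min, data = sim, family = binomial)

# Lower AIC suggests a better trade-off between fit and complexity
aic_table <- AIC(m_small, m_large)
print(aic_table)

# Likelihood ratio test for the added term (valid because the models are nested)
lrt <- anova(m_small, m_large, test = "Chisq")
print(lrt)
```

When the same formula is refitted to a data frame with missing values, `glm()` drops incomplete rows under R's default `na.action` setting, so comparing `nobs()` between the two fits is a useful first check before comparing coefficients.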
summary(icu_patients_df1)
## RecordID Length_of_stay SAPS1 SOFA
## Min. :132539 Min. : -1.00 Min. : 1.00 Min. :-1.000
## 1st Qu.:133875 1st Qu.: 6.00 1st Qu.:11.00 1st Qu.: 3.000
## Median :135146 Median : 10.00 Median :15.00 Median : 6.000
## Mean :135156 Mean : 13.74 Mean :14.96 Mean : 6.441
## 3rd Qu.:136477 3rd Qu.: 17.00 3rd Qu.:19.00 3rd Qu.: 9.000
## Max. :137740 Max. :154.00 Max. :34.00 Max. :22.000
## NA's :96
## Survival in_hospital_death Days Status
## Min. : 0.0 Min. :0.0000 Min. : 0 Mode :logical
## 1st Qu.: 10.0 1st Qu.:0.0000 1st Qu.: 265 FALSE:1288
## Median : 68.0 Median :0.0000 Median :2408 TRUE :773
## Mean : 343.1 Mean :0.1441 Mean :1634
## 3rd Qu.: 420.0 3rd Qu.:0.0000 3rd Qu.:2408
## Max. :2408.0 Max. :1.0000 Max. :2408
## NA's :1288
## Age Albumin_diff Albumin_max Albumin_min
## Min. :16.00 Min. :0.01866 Min. :1.100 Min. :1.100
## 1st Qu.:52.00 1st Qu.:0.28134 1st Qu.:2.600 1st Qu.:2.600
## Median :67.00 Median :0.48134 Median :3.000 Median :3.000
## Mean :64.41 Mean :0.56829 Mean :3.045 Mean :3.012
## 3rd Qu.:78.00 3rd Qu.:0.81866 3rd Qu.:3.500 3rd Qu.:3.500
## Max. :90.00 Max. :2.31866 Max. :5.300 Max. :5.300
##
## ALP_diff ALP_max ALP_min ALT_diff
## Min. : 0.148 Min. : 19.0 Min. : 19.0 Min. : 0.446
## 1st Qu.: 21.852 1st Qu.: 57.0 1st Qu.: 58.0 1st Qu.: 89.446
## Median : 37.852 Median : 78.0 Median : 76.0 Median : 102.446
## Mean : 56.259 Mean : 105.7 Mean : 101.4 Mean : 154.873
## 3rd Qu.: 54.852 3rd Qu.: 110.0 3rd Qu.: 105.0 3rd Qu.: 108.446
## Max. :1408.148 Max. :1504.0 Max. :1339.0 Max. :10319.554
##
## ALT_max ALT_min AST_diff AST_max
## Min. : 3.0 Min. : 1.0 Min. : 0.647 Min. : 5.0
## 1st Qu.: 17.0 1st Qu.: 17.0 1st Qu.: 123.353 1st Qu.: 27.0
## Median : 30.0 Median : 30.0 Median : 142.353 Median : 51.0
## Mean : 118.3 Mean : 90.1 Mean : 227.991 Mean : 188.1
## 3rd Qu.: 69.0 3rd Qu.: 69.0 3rd Qu.: 152.353 3rd Qu.: 130.0
## Max. :10440.0 Max. :9240.0 Max. :15870.647 Max. :16040.0
##
## AST_min Bilirubin_diff Bilirubin_max Bilirubin_min
## Min. : 5.0 Min. : 0.03596 Min. : 0.100 Min. : 0.100
## 1st Qu.: 24.0 1st Qu.: 1.06404 1st Qu.: 0.400 1st Qu.: 0.400
## Median : 42.0 Median : 1.36404 Median : 0.700 Median : 0.600
## Mean : 116.4 Mean : 1.97637 Mean : 1.739 Mean : 1.568
## 3rd Qu.: 87.0 3rd Qu.: 1.46404 3rd Qu.: 1.300 3rd Qu.: 1.100
## Max. :7960.0 Max. :44.13596 Max. :45.900 Max. :45.500
##
## BUN_diff BUN_max BUN_min Cholesterol_diff
## Min. : 0.4729 Min. : 3.00 Min. : 2.00 Min. : 0.5772
## 1st Qu.: 7.4729 1st Qu.: 14.00 1st Qu.: 12.00 1st Qu.: 17.5772
## Median : 11.5270 Median : 20.00 Median : 18.00 Median : 34.4228
## Mean : 15.7904 Mean : 27.48 Mean : 24.44 Mean : 37.2723
## 3rd Qu.: 16.5270 3rd Qu.: 33.00 3rd Qu.: 29.00 3rd Qu.: 55.4228
## Max. :172.4729 Max. :197.00 Max. :157.00 Max. :173.5772
##
## Cholesterol_max Cholesterol_min Creatinine_diff Creatinine_max
## Min. : 59.0 Min. : 59 Min. : 0.03245 Min. : 0.200
## 1st Qu.:122.0 1st Qu.:121 1st Qu.: 0.33245 1st Qu.: 0.800
## Median :152.0 Median :152 Median : 0.53245 Median : 1.000
## Mean :153.4 Mean :153 Mean : 0.86298 Mean : 1.499
## 3rd Qu.:181.0 3rd Qu.:179 3rd Qu.: 0.73245 3rd Qu.: 1.500
## Max. :330.0 Max. :330 Max. :20.76755 Max. :22.000
##
## Creatinine_min DiasABP_diff DiasABP_max DiasABP_min
## Min. : 0.200 Min. : 0.5442 Min. : 22.00 Min. : 2.00
## 1st Qu.: 0.700 1st Qu.: 16.5442 1st Qu.: 68.00 1st Qu.: 40.00
## Median : 0.900 Median : 21.5442 Median : 77.00 Median : 46.00
## Mean : 1.319 Mean : 24.5299 Mean : 78.24 Mean : 46.56
## 3rd Qu.: 1.300 3rd Qu.: 28.4558 3rd Qu.: 86.00 3rd Qu.: 52.00
## Max. :14.100 Max. :209.4558 Max. :268.00 Max. :258.00
## NA's :715 NA's :715 NA's :715
## FiO2_diff FiO2_max FiO2_min GCS_diff
## Min. :0.00192 Min. :0.2800 Min. :0.2800 Min. :0.244
## 1st Qu.:0.15192 1st Qu.:0.5000 1st Qu.:0.4000 1st Qu.:3.756
## Median :0.44808 Median :1.0000 Median :0.4000 Median :3.756
## Mean :0.31376 Mean :0.7874 Mean :0.4863 Mean :5.183
## 3rd Qu.:0.44808 3rd Qu.:1.0000 3rd Qu.:0.5000 3rd Qu.:8.244
## Max. :0.44808 Max. :1.0000 Max. :1.0000 Max. :8.244
##
## GCS_max GCS_min Gender Glucose_diff
## Min. : 3.00 Min. : 3.000 Female: 913 Min. : 0.1445
## 1st Qu.:11.00 1st Qu.: 3.000 Male :1148 1st Qu.: 23.8555
## Median :15.00 Median : 8.000 Median : 39.1445
## Mean :12.87 Mean : 8.773 Mean : 57.0844
## 3rd Qu.:15.00 3rd Qu.:14.000 3rd Qu.: 61.8555
## Max. :15.00 Max. :15.000 Max. :1003.1445
##
## Glucose_max Glucose_min HCO3_diff HCO3_max
## Min. : 39.0 Min. : 24.0 Min. : 0.2275 Min. : 9.00
## 1st Qu.: 117.0 1st Qu.: 98.0 1st Qu.: 1.7725 1st Qu.:22.00
## Median : 141.0 Median :117.0 Median : 3.2275 Median :24.00
## Mean : 163.3 Mean :124.8 Mean : 4.1506 Mean :24.27
## 3rd Qu.: 180.0 3rd Qu.:141.0 3rd Qu.: 5.2275 3rd Qu.:27.00
## Max. :1143.0 Max. :632.0 Max. :24.2275 Max. :47.00
##
## HCO3_min HCT_diff HCT_max HCT_min
## Min. : 5.00 Min. : 0.06013 Min. :21.20 Min. : 9.00
## 1st Qu.:20.00 1st Qu.: 2.96013 1st Qu.:30.00 1st Qu.:26.20
## Median :23.00 Median : 5.16013 Median :33.10 Median :29.60
## Mean :22.43 Mean : 5.70366 Mean :33.57 Mean :30.08
## 3rd Qu.:25.00 3rd Qu.: 7.66013 3rd Qu.:36.70 3rd Qu.:33.70
## Max. :44.00 Max. :23.43987 Max. :54.40 Max. :50.60
##
## Height HR_diff HR_max HR_min
## Min. : 13.0 Min. : 0.9221 Min. : 44.0 Min. : 0.00
## 1st Qu.:162.6 1st Qu.: 20.0779 1st Qu.: 91.0 1st Qu.: 61.00
## Median :170.2 Median : 27.0779 Median :104.0 Median : 71.00
## Mean :170.0 Mean : 30.4294 Mean :106.6 Mean : 71.99
## 3rd Qu.:177.8 3rd Qu.: 36.9221 3rd Qu.:119.0 3rd Qu.: 81.00
## Max. :426.7 Max. :212.9221 Max. :300.0 Max. :126.00
## NA's :992
## ICUType K_diff K_max
## Coronary Care Unit :297 Min. : 0.03521 Min. : 2.500
## Cardiac Surgery Recovery Unit:448 1st Qu.: 0.33521 1st Qu.: 4.000
## Medical ICU :788 Median : 0.56479 Median : 4.300
## Surgical ICU :528 Mean : 0.69010 Mean : 4.419
## 3rd Qu.: 0.86479 3rd Qu.: 4.700
## Max. :18.76479 Max. :22.900
##
## K_min Lactate_diff Lactate_max Lactate_min
## Min. :1.80 Min. : 0.003596 Min. : 0.400 Min. : 0.300
## 1st Qu.:3.50 1st Qu.: 1.096404 1st Qu.: 1.500 1st Qu.: 1.200
## Median :3.90 Median : 1.503596 Median : 2.200 Median : 1.600
## Mean :3.95 Mean : 1.753380 Mean : 2.773 Mean : 1.899
## 3rd Qu.:4.30 3rd Qu.: 1.896404 3rd Qu.: 3.200 3rd Qu.: 2.200
## Max. :6.90 Max. :26.503596 Max. :29.300 Max. :24.200
##
## MAP_diff MAP_max MAP_min Mg_diff
## Min. : 0.2316 Min. : 4.0 Min. : 1.00 Min. :0.0157
## 1st Qu.: 21.7684 1st Qu.: 94.0 1st Qu.: 55.00 1st Qu.:0.1843
## Median : 29.2316 Median :104.0 Median : 61.00 Median :0.3157
## Mean : 38.4735 Mean :111.8 Mean : 62.76 Mean :0.4181
## 3rd Qu.: 41.2316 3rd Qu.:117.0 3rd Qu.: 70.00 3rd Qu.:0.5843
## Max. :213.2316 Max. :291.0 Max. :265.00 Max. :7.9157
##
## Mg_max Mg_min Na_diff Na_max
## Min. :1.100 Min. :0.600 Min. : 0.2066 Min. :112.0
## 1st Qu.:1.900 1st Qu.:1.600 1st Qu.: 1.7934 1st Qu.:137.0
## Median :2.100 Median :1.800 Median : 3.2066 Median :140.0
## Mean :2.153 Mean :1.857 Mean : 4.1146 Mean :139.8
## 3rd Qu.:2.400 3rd Qu.:2.100 3rd Qu.: 5.2066 3rd Qu.:142.0
## Max. :9.900 Max. :6.200 Max. :41.2066 Max. :177.0
##
## Na_min NIDiasABP_diff NIDiasABP_max NIDiasABP_min
## Min. : 98 Min. : 0.491 Min. : 29.00 Min. :10.00
## 1st Qu.:136 1st Qu.: 17.509 1st Qu.: 64.00 1st Qu.:33.00
## Median :138 Median : 25.500 Median : 76.00 Median :42.00
## Mean :138 Mean : 26.964 Mean : 76.92 Mean :43.17
## 3rd Qu.:141 3rd Qu.: 33.509 3rd Qu.: 89.00 3rd Qu.:52.00
## Max. :160 Max. :116.509 Max. :174.00 Max. :97.00
## NA's :455 NA's :455 NA's :455
## NIMAP_diff NIMAP_max NIMAP_min NISysABP_diff
## Min. : 0.0407 Min. : 47.33 Min. : 7.00 Min. : 0.3013
## 1st Qu.: 18.2893 1st Qu.: 81.08 1st Qu.: 52.33 1st Qu.: 25.6987
## Median : 24.7107 Median : 93.67 Median : 60.00 Median : 34.3013
## Mean : 26.9759 Mean : 94.47 Mean : 61.69 Mean : 37.7962
## 3rd Qu.: 33.2893 3rd Qu.:106.00 3rd Qu.: 70.00 3rd Qu.: 45.6987
## Max. :113.2893 Max. :189.00 Max. :121.00 Max. :157.3013
## NA's :455 NA's :455 NA's :455 NA's :453
## NISysABP_max NISysABP_min PaCO2_diff PaCO2_max
## Min. : 78.0 Min. : 4.00 Min. : 0.3358 Min. :16.00
## 1st Qu.:121.0 1st Qu.: 83.00 1st Qu.: 5.6642 1st Qu.:39.00
## Median :138.0 Median : 95.00 Median : 8.6642 Median :44.00
## Mean :140.5 Mean : 96.55 Mean :10.7463 Mean :45.56
## 3rd Qu.:156.0 3rd Qu.:108.00 3rd Qu.:13.3358 3rd Qu.:50.00
## Max. :274.0 Max. :234.00 Max. :57.6642 Max. :98.00
## NA's :453 NA's :453
## PaCO2_min PaO2_diff PaO2_max PaO2_min
## Min. : 0.30 Min. : 0.6179 Min. : 27.0 Min. : 20.0
## 1st Qu.:32.00 1st Qu.: 67.6179 1st Qu.:123.0 1st Qu.: 74.0
## Median :36.00 Median : 90.6179 Median :191.0 Median : 92.0
## Mean :36.72 Mean :119.5407 Mean :223.5 Mean :105.8
## 3rd Qu.:40.00 3rd Qu.:154.3821 3rd Qu.:311.0 3rd Qu.:122.0
## Max. :93.00 Max. :341.3821 Max. :500.0 Max. :477.0
##
## pH_diff pH_max pH_min Platelets_diff
## Min. :0.000114 Min. :7.150 Min. :3.000 Min. : 0.2307
## 1st Qu.:0.059886 1st Qu.:7.380 1st Qu.:7.280 1st Qu.: 39.7693
## Median :0.089886 Median :7.420 Median :7.340 Median : 72.7693
## Mean :0.098486 Mean :7.418 Mean :7.327 Mean : 92.5348
## 3rd Qu.:0.120114 3rd Qu.:7.460 3rd Qu.:7.390 3rd Qu.:116.7693
## Max. :4.369886 Max. :7.690 Max. :7.630 Max. :857.2307
##
## Platelets_max Platelets_min RespRate_diff RespRate_max
## Min. : 18.0 Min. : 9.0 Min. : 0.6514 Min. :13.00
## 1st Qu.: 157.0 1st Qu.:126.0 1st Qu.: 7.3486 1st Qu.:24.00
## Median : 210.0 Median :184.0 Median : 9.6514 Median :27.00
## Mean : 228.9 Mean :197.9 Mean :11.6075 Mean :29.12
## 3rd Qu.: 275.0 3rd Qu.:246.0 3rd Qu.:13.6514 3rd Qu.:33.00
## Max. :1047.0 Max. :891.0 Max. :78.6514 Max. :98.00
##
## RespRate_min SaO2_diff SaO2_max SaO2_min
## Min. : 4.00 Min. : 0.2461 Min. : 75.00 Min. : 33.00
## 1st Qu.:12.00 1st Qu.: 0.7539 1st Qu.: 97.00 1st Qu.: 95.00
## Median :14.00 Median : 1.7539 Median : 98.00 Median : 97.00
## Mean :14.25 Mean : 2.5635 Mean : 97.44 Mean : 95.85
## 3rd Qu.:17.00 3rd Qu.: 3.2461 3rd Qu.: 99.00 3rd Qu.: 98.00
## Max. :24.00 Max. :64.2461 Max. :100.00 Max. :100.00
##
## SysABP_diff SysABP_max SysABP_min Temp_diff
## Min. : 3.689 Min. : 52.0 Min. : 11.00 Min. : 0.1259
## 1st Qu.: 32.310 1st Qu.:135.0 1st Qu.: 79.00 1st Qu.: 0.8741
## Median : 40.690 Median :149.0 Median : 88.00 Median : 1.2741
## Mean : 45.008 Mean :152.1 Mean : 90.91 Mean : 1.3756
## 3rd Qu.: 53.690 3rd Qu.:167.0 3rd Qu.:102.00 3rd Qu.: 1.7259
## Max. :178.690 Max. :295.0 Max. :262.00 Max. :12.7741
## NA's :715 NA's :715 NA's :715
## Temp_max Temp_min TroponinI_diff TroponinI_max
## Min. :35.40 Min. :24.20 Min. : 0.1571 Min. : 0.30
## 1st Qu.:37.10 1st Qu.:35.60 1st Qu.: 4.6429 1st Qu.: 2.60
## Median :37.60 Median :36.10 Median : 5.2571 Median : 7.80
## Mean :37.69 Mean :36.01 Mean :10.1737 Mean :11.83
## 3rd Qu.:38.20 3rd Qu.:36.60 3rd Qu.:12.1571 3rd Qu.:17.60
## Max. :42.10 Max. :38.30 Max. :37.9571 Max. :43.40
##
## TroponinI_min TroponinT_diff TroponinT_max TroponinT_min
## Min. : 0.30 Min. : 0.0215 Min. : 0.0100 Min. : 0.0100
## 1st Qu.: 1.30 1st Qu.: 0.5785 1st Qu.: 0.0600 1st Qu.: 0.0400
## Median : 6.80 Median : 0.6285 Median : 0.1700 Median : 0.1200
## Mean :10.06 Mean : 1.0920 Mean : 0.9079 Mean : 0.6347
## 3rd Qu.:13.20 3rd Qu.: 0.6585 3rd Qu.: 0.8000 3rd Qu.: 0.4700
## Max. :42.90 Max. :23.7915 Max. :24.4600 Max. :22.9300
##
## Urine_diff Urine_max Urine_min WBC_diff
## Min. : 19.22 Min. : 0.0 Min. : 0.00 Min. : 0.03315
## 1st Qu.: 100.78 1st Qu.: 200.0 1st Qu.: 0.00 1st Qu.: 2.63315
## Median : 300.78 Median : 400.0 Median : 20.00 Median : 4.53315
## Mean : 438.25 Mean : 521.8 Mean : 34.55 Mean : 5.82079
## 3rd Qu.: 525.78 3rd Qu.: 625.0 3rd Qu.: 36.00 3rd Qu.: 7.23315
## Max. :4900.78 Max. :5000.0 Max. :600.00 Max. :143.46685
##
## WBC_max WBC_min Weight_diff Weight_max
## Min. : 0.10 Min. : 0.10 Min. : 0.00012 Min. : 34.60
## 1st Qu.: 9.30 1st Qu.: 7.60 1st Qu.: 7.60000 1st Qu.: 66.00
## Median : 12.30 Median : 10.40 Median : 14.70012 Median : 80.00
## Mean : 13.95 Mean : 11.51 Mean : 18.17040 Mean : 82.66
## 3rd Qu.: 16.90 3rd Qu.: 14.10 3rd Qu.: 24.80000 3rd Qu.: 94.55
## Max. :155.60 Max. :128.30 Max. :149.30012 Max. :230.00
## NA's :146 NA's :146
## Weight_min
## Min. : 34.60
## 1st Qu.: 65.00
## Median : 77.70
## Mean : 80.86
## 3rd Qu.: 91.95
## Max. :230.00
## NA's :146
head(icu_patients_df1)
## RecordID Length_of_stay SAPS1 SOFA Survival in_hospital_death Days Status Age
## 1 132539 5 6 1 NA 0 2408 FALSE 54
## 2 132540 8 16 8 NA 0 2408 FALSE 76
## 3 132541 19 21 11 NA 0 2408 FALSE 44
## 4 132543 9 7 1 575 0 575 TRUE 68
## 5 132545 4 17 2 918 0 918 TRUE 88
## 6 132547 6 14 11 1637 0 1637 TRUE 64
## Albumin_diff Albumin_max Albumin_min ALP_diff ALP_max ALP_min ALT_diff
## 1 0.2186633 3.2 3.1 118.147964 214 202 80.44617
## 2 0.8813367 2.1 2.2 252.147964 338 348 94.44617
## 3 0.6813367 2.7 2.3 31.147964 127 105 45.44617
## 4 1.4186633 4.4 4.4 9.147964 105 105 108.44617
## 5 0.3813367 2.7 2.6 56.852036 39 78 96.44617
## 6 0.4186633 3.4 3.3 5.147964 101 101 75.44617
## ALT_max ALT_min AST_diff AST_max AST_min Bilirubin_diff Bilirubin_max
## 1 40 75 131.35271 38 53 1.464039 0.4
## 2 206 26 116.35271 53 74 1.564039 1.2
## 3 91 75 65.64729 235 164 1.235961 3.0
## 4 12 12 154.35271 15 15 1.564039 0.2
## 5 24 32 154.35271 15 97 1.364039 0.4
## 6 60 45 122.35271 162 47 1.364039 0.4
## Bilirubin_min BUN_diff BUN_max BUN_min Cholesterol_diff Cholesterol_max
## 1 0.3 11.527053 13 13 16.42276 154
## 2 0.2 8.527053 18 16 28.42276 139
## 3 2.8 21.527053 8 3 56.42276 111
## 4 0.2 4.527053 23 20 37.42276 127
## 5 0.9 20.472947 45 45 55.42276 104
## 6 0.4 9.527053 19 15 55.57724 212
## Cholesterol_min Creatinine_diff Creatinine_max Creatinine_min DiasABP_diff
## 1 140 0.4324463 0.8 0.8 NA
## 2 128 0.4324463 1.2 0.8 26.54421
## 3 100 0.9324463 0.4 0.3 NA
## 4 119 0.5324463 0.9 0.7 NA
## 5 101 0.2324463 1.0 1.0 NA
## 6 212 0.3324463 1.4 0.9 20.45579
## DiasABP_max DiasABP_min FiO2_diff FiO2_max FiO2_min GCS_diff GCS_max GCS_min
## 1 NA NA 0.05192012 0.5 0.5 3.755971 15 15
## 2 81 32 0.44807988 1.0 0.4 8.244029 15 3
## 3 NA NA 0.44807988 1.0 0.5 6.244029 8 5
## 4 NA NA 0.44807988 1.0 0.4 3.755971 15 14
## 5 NA NA 0.15192012 0.4 0.5 3.755971 15 15
## 6 79 55 0.05192012 0.5 0.5 4.244029 9 7
## Gender Glucose_diff Glucose_max Glucose_min HCO3_diff HCO3_max HCO3_min
## 1 Female 65.14446 205 205 3.227452 26 26
## 2 Male 34.85554 105 105 1.772548 22 21
## 3 Female 20.85554 141 119 3.227452 26 24
## 4 Male 33.85554 129 106 5.227452 28 27
## 5 Female 26.85554 113 113 4.772548 18 18
## 6 Male 124.14446 264 197 3.772548 19 19
## HCT_diff HCT_max HCT_min Height HR_diff HR_max HR_min
## 1 2.739871 33.7 33.5 NA 29.077891 80 58
## 2 6.260129 29.7 24.7 175.3 7.077891 88 80
## 3 4.260129 28.5 26.7 NA 30.077891 113 57
## 4 10.339871 41.3 36.1 180.3 30.077891 88 57
## 5 8.360129 30.8 22.6 NA 20.077891 94 67
## 6 10.639871 41.6 36.8 180.3 16.077891 91 71
## ICUType K_diff K_max K_min Lactate_diff Lactate_max
## 1 Surgical ICU 0.2647934 4.4 4.4 0.9964037 1.9
## 2 Cardiac Surgery Recovery Unit 0.1647934 4.3 4.3 1.4964037 2.9
## 3 Medical ICU 4.4647934 8.6 3.3 1.4964037 1.9
## 4 Medical ICU 0.1352066 4.2 4.0 1.5964037 1.2
## 5 Medical ICU 1.8647934 6.0 3.8 0.8964037 2.0
## 6 Coronary Care Unit 0.9647934 5.1 3.8 1.8964037 0.9
## Lactate_min MAP_diff MAP_max MAP_min Mg_diff Mg_max Mg_min Na_diff Na_max
## 1 1.8 31.23164 109 56 0.4842982 1.5 1.5 2.2066071 137
## 2 1.3 34.76836 100 43 1.1157018 3.1 1.9 0.2066071 139
## 3 1.3 53.23164 131 71 0.6842982 1.9 1.3 2.2066071 140
## 4 1.5 24.23164 102 72 0.1157018 2.1 2.1 1.7933929 141
## 5 1.9 9.76836 78 68 0.4842982 1.5 1.5 0.7933929 140
## 6 1.3 24.23164 102 62 0.2842982 1.7 1.7 2.2066071 141
## Na_min NIDiasABP_diff NIDiasABP_max NIDiasABP_min NIMAP_diff NIMAP_max
## 1 137 17.49101 65 40 17.04069 92.33
## 2 139 19.49101 65 38 26.38069 86.33
## 3 137 37.50899 95 66 34.28931 110.00
## 4 140 23.50899 81 54 24.98931 100.70
## 5 140 38.50899 96 29 29.98931 105.70
## 6 137 31.50899 89 52 26.58931 102.30
## NIMAP_min NISysABP_diff NISysABP_max NISysABP_min PaCO2_diff PaCO2_max
## 1 58.67 40.30125 157 96 3.335797 37
## 2 49.33 44.69875 129 72 7.335797 41
## 3 83.33 33.30125 150 111 3.335797 37
## 4 73.00 23.30125 140 102 9.335797 38
## 5 63.67 39.30125 156 119 6.335797 34
## 6 61.67 35.69875 129 81 5.335797 45
## PaCO2_min PaO2_diff PaO2_max PaO2_min pH_diff pH_max pH_min Platelets_diff
## 1 38 47.61789 186 111 0.12011376 7.49 7.43 31.23069
## 2 33 286.38211 445 89 0.08011376 7.45 7.34 36.23069
## 3 37 93.61789 65 65 0.14011376 7.51 7.51 117.76931
## 4 31 94.61789 148 64 0.14011376 7.51 7.47 201.23069
## 5 35 80.61789 78 84 0.04011376 7.38 7.41 80.76931
## 6 35 80.61789 101 78 0.07988624 7.40 7.29 86.23069
## Platelets_max Platelets_min RespRate_diff RespRate_max RespRate_min SaO2_diff
## 1 221 221 7.34858 24 12 3.246079
## 2 226 164 16.65142 36 11 1.753921
## 3 84 72 13.65142 33 18 2.246079
## 4 391 315 7.34858 21 12 1.753921
## 5 109 109 6.65142 26 15 3.246079
## 6 276 219 27.65142 47 20 1.246079
## SaO2_max SaO2_min SysABP_diff SysABP_max SysABP_min Temp_diff Temp_max
## 1 98 94 NA NA NA 1.874083 38.1
## 2 99 97 50.3105 135 66 2.474083 37.9
## 3 95 95 NA NA NA 2.025917 39.0
## 4 99 97 NA NA NA 1.874083 36.7
## 5 97 94 NA NA NA 1.174083 37.8
## 6 97 96 43.3105 152 73 1.174083 37.8
## Temp_min TroponinI_diff TroponinI_max TroponinI_min TroponinT_diff
## 1 35.1 5.1429448 1.0 0.3 0.4785006
## 2 34.5 26.2570552 31.7 16.1 0.6485006
## 3 36.7 31.2570552 33.4 36.7 0.8814994
## 4 35.1 0.8570552 5.9 6.3 0.6485006
## 5 35.8 0.1570552 5.6 5.6 0.6085006
## 6 35.8 4.1429448 1.3 1.3 0.6385006
## TroponinT_max TroponinT_min Urine_diff Urine_max Urine_min WBC_diff WBC_max
## 1 0.58 0.19 800.78242 900 30 0.9331524 11.2
## 2 0.43 0.02 670.78242 770 0 4.7331524 13.1
## 3 1.55 1.41 310.78242 410 30 8.4331524 4.2
## 4 0.10 0.02 600.78242 700 100 3.3331524 11.5
## 5 0.06 0.37 83.21758 150 16 8.3331524 3.8
## 6 0.03 0.10 1100.78242 1200 40 11.8668476 24.0
## WBC_min Weight_diff Weight_max Weight_min
## 1 11.2 NA NA NA
## 2 7.4 4.699878 80.6 76.0
## 3 3.7 23.999878 56.7 56.7
## 4 8.8 3.900122 84.6 84.6
## 5 3.8 NA NA NA
## 6 14.4 33.300122 114.0 114.0
# Initial variables used to predict in-hospital death
# Based on background knowledge
# Albumin_min : low albumin assoc with malnutrition
# Bilirubin_max: may indicate liver failure and also is included in SOFA scoring
# BUN_max : high urea assoc with renal impairment
# Creatinine_max : high creatinine assoc with renal impairment
# FiO2_max : high oxygen requirements indicate lung pathology
# GCS_min : low GCS indicates head pathology
# Glucose_min : both hypo and hyperglycaemia can contribute to mortality
# Glucose_max
# HCT_min : HCT or haemoglobin assoc with mortality
# HR_min : too high or too low HR can cause cardiac issues
# HR_max
# K_min : both hypo and hyperkalaemia can indicate pathology
# K_max
# Lactate_max : high lactate is a marker of inadequate organ perfusion
# Mg_min : low magnesium could result in arrhythmias
# MAP_min : hypoperfusion leads to morbidity
# MAP_max : hyperperfusion / dysregulation of circulation could lead to morbidity
# MAP_diff
# MechVent : mech vent associated with mortality
# Na_min : both hypo and hypernatremia can lead to brain oedema / damage / seizures
# Na_max
# PaCO2_min
# PaCO2_max : hypercapnia can result from ventilatory failure
# PaO2_min : low arterial oxygen (hypoxaemia) indicates impaired gas exchange
# pH_min : both acidaemia and alkalaemia indicate systemic or renal pathology
# pH_max
# Platelets_min : low platelets indicate higher risk of bleeding
# RespRate_max : high RR is related to mortality
# SaO2_min : low oxygen saturation indicates hypoxaemia
# Temp_min : hypothermia / hyperthermia may lead to morbidity
# Temp_max
# TropT_max : indicates myocardial injury
# Urine_min : low urine output indicates renal impairment
# WBC_max : presence of infection
# Weight : whether obesity has any contributing factors
# Non clinical variables:
# age
# gender
# height (and therefore BMI)
# ICUType
# Length_of_stay
# SAPS1
# Variables whose clinical significance I am unsure of:
# ALP / ALT / AST
# HCO3
# add in BMI variable
# icu_patients_df1$BMI <- (icu_patients_df1$Weight_min) / (0.01*icu_patients_df1$Height)^2
# summary(icu_patients_df1$Weight_min)
# summary(icu_patients_df1$Height) #max height is 426.7cm!?
# summary(icu_patients_df1$BMI)
# BMI actually gives no meaningful data - remove this code!
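The BMI attempt above runs into implausible Height values (the summary shows a minimum of 13 and a maximum of 426.7). One possible approach, sketched below on invented toy data, is to set heights outside an assumed plausible adult range to NA before computing BMI. The 120 to 220 cm window and the `Height_clean` column are hypothetical choices for illustration, not something specified in the assignment.

```r
# Toy values echoing the kinds of heights seen in summary(); invented data
toy <- data.frame(
  Weight_min = c(76.0, 84.6, 114.0, 70.0),
  Height     = c(175.3, 180.3, 426.7, 13.0)  # cm; the last two are implausible
)

# Assumed plausibility window for adult height in cm (hypothetical cut-offs)
height_ok        <- toy$Height >= 120 & toy$Height <= 220
toy$Height_clean <- ifelse(height_ok, toy$Height, NA)

# BMI = weight (kg) / height (m)^2; rows with an implausible height become NA
toy$BMI <- toy$Weight_min / (toy$Height_clean / 100)^2
print(toy)
```

Because Height is also missing for many patients, a BMI built this way would still be unavailable for a large share of records, which is a reasonable argument for using weight alone.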
# Basic EDA for each variable
library(ggplot2)
table(icu_patients_df1$in_hospital_death) #297 deaths out of 2061 observations = 14.4%
##
## 0 1
## 1764 297
# Patients who died had lower albumin
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Albumin_min))+ geom_boxplot()
# Patients who died had higher bilirubin
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Bilirubin_max))+ geom_boxplot()
# Patients who died had higher urea
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = BUN_max))+ geom_boxplot()
# Patients who died had higher creatinine
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Creatinine_max))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = FiO2_max))+ geom_boxplot()
# Patients who died had lower GCS
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = GCS_min))+ geom_boxplot()
# Little to no difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Glucose_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Glucose_max))+ geom_boxplot()
# Patients who died had slightly lower HCT
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = HCT_min))+ geom_boxplot()
# Patients who died had slightly higher HR_min and HR_max
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = HR_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = HR_max))+ geom_boxplot()
# Little to no difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = K_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = K_max))+ geom_boxplot()
# Patients who died had slightly higher lactate
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Lactate_max))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Mg_min))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = MAP_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = MAP_max))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = MAP_diff))+ geom_boxplot()
# MechVent column was deleted from the data
# ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = MechVent))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Na_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Na_max))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = PaCO2_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = PaCO2_max))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = PaO2_min))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = pH_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = pH_max))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Platelets_min))+ geom_boxplot()
# Patients who died had higher RR
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = RespRate_max))+ geom_boxplot()
# More outliers in patients who survived with low saturations
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = SaO2_min))+ geom_boxplot()
# Patients who died had slightly lower temperatures
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Temp_min))+ geom_boxplot()
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Temp_max))+ geom_boxplot()
# Patients who died possibly had slightly higher tropT
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = TroponinT_max))+ geom_boxplot()
# Patients who died had slightly less urine output
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Urine_min))+ geom_boxplot()
# Patients who died had slightly higher WBC
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = WBC_max))+ geom_boxplot()
# Uninterpretable data (BMI), so this plot is not used
# ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = BMI))+ geom_boxplot()
# Using weight instead, roughly the same
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Weight_min))+ geom_boxplot()
## Warning: Removed 146 rows containing non-finite values (stat_boxplot).
# Patients who died were older
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Age))+ geom_boxplot()
# No difference
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = Length_of_stay))+ geom_boxplot()
# Patients who died had higher SAPS1 and SOFA scores
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = SAPS1))+ geom_boxplot()
## Warning: Removed 96 rows containing non-finite values (stat_boxplot).
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = SOFA))+ geom_boxplot()
# The Cardiac Surgery Recovery Unit has a smaller "death" circle than the other three ICU types,
# i.e. a lower proportion of in-hospital deaths relative to survivors
ggplot(data=icu_patients_df1, mapping = aes(x = in_hospital_death=="1", y = ICUType)) +
  geom_count(aes(size = after_stat(prop), group = ICUType)) +
  scale_size_area(max_size = 50)
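The circle sizes in the count plot can be hard to compare by eye. A numeric alternative, sketched here on invented toy data, is a table of row proportions by ICU type using base R's `table()` and `prop.table()`. The data frame `toy_icu` and its values are assumptions for illustration only.

```r
# Invented toy data standing in for ICUType and in_hospital_death
toy_icu <- data.frame(
  ICUType = c(rep("Medical ICU", 4),
              rep("Cardiac Surgery Recovery Unit", 4)),
  in_hospital_death = c(1, 1, 0, 0,
                        1, 0, 0, 0)
)

# Cross-tabulate ICU type (rows) against outcome (columns)
counts <- table(toy_icu$ICUType, toy_icu$in_hospital_death)

# margin = 1 converts counts to proportions within each row (ICU type)
death_prop <- prop.table(counts, margin = 1)
print(round(death_prop, 2))
```

The column labelled `1` then gives the proportion of in-hospital deaths within each ICU type directly.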
# univariate comparisons above
# removed: Mg_min, Na_min, Na_max, MAP_diff, MAP_max
### significant variables ###
minAlbumin_glm <- glm(in_hospital_death ~ Albumin_min, data=icu_patients_df1, family="binomial")
summary(minAlbumin_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Albumin_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7948 -0.5887 -0.5385 -0.4595 2.2842
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.36392 0.29389 -1.238 0.216
## Albumin_min -0.48186 0.09987 -4.825 1.4e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1676.0 on 2059 degrees of freedom
## AIC: 1680
##
## Number of Fisher Scoring iterations: 4
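Logistic regression coefficients are on the log-odds scale, so exponentiating them gives odds ratios, which are usually easier to interpret. The sketch below uses the Albumin_min estimate, standard error, and deviances copied from the output above; the Wald interval and the 1-df likelihood ratio test are standard calculations, shown here as an illustration rather than as part of the original analysis.

```r
# Values copied from the Albumin_min model summary
est <- -0.48186   # coefficient (log-odds per unit increase in Albumin_min)
se  <-  0.09987   # standard error of the coefficient

# Odds ratio and approximate 95% Wald confidence interval
odds_ratio <- exp(est)
wald_ci    <- exp(est + c(-1, 1) * qnorm(0.975) * se)

# Likelihood ratio test against the null model:
# drop in deviance compared with a chi-squared distribution on 1 df
dev_drop <- 1699.7 - 1676.0
lr_p     <- pchisq(dev_drop, df = 1, lower.tail = FALSE)

print(round(c(OR = odds_ratio, lower = wald_ci[1], upper = wald_ci[2]), 3))
print(lr_p)
```

With the fitted object available, `exp(coef(minAlbumin_glm))` and `exp(confint.default(minAlbumin_glm))` give the same odds ratio and Wald interval without copying numbers by hand.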
maxBili_glm <- glm(in_hospital_death ~ Bilirubin_max, data=icu_patients_df1, family="binomial")
summary(maxBili_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Bilirubin_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.3889 -0.5421 -0.5363 -0.5321 2.0174
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.90053 0.06866 -27.679 < 2e-16 ***
## Bilirubin_max 0.05692 0.01135 5.013 5.35e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1676.8 on 2059 degrees of freedom
## AIC: 1680.8
##
## Number of Fisher Scoring iterations: 4
maxUrea_glm <- glm(in_hospital_death ~ BUN_max, data=icu_patients_df1, family="binomial")
summary(maxUrea_glm)
##
## Call:
## glm(formula = in_hospital_death ~ BUN_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.0462 -0.5269 -0.4789 -0.4443 2.2309
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.492189 0.103693 -24.034 <2e-16 ***
## BUN_max 0.022610 0.002347 9.634 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1607.3 on 2059 degrees of freedom
## AIC: 1611.3
##
## Number of Fisher Scoring iterations: 4
maxCr_glm <- glm(in_hospital_death ~ Creatinine_max, data=icu_patients_df1, family="binomial")
summary(maxCr_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Creatinine_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8627 -0.5433 -0.5270 -0.5151 2.0633
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.05087 0.08430 -24.328 < 2e-16 ***
## Creatinine_max 0.16325 0.03135 5.208 1.91e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1674.4 on 2059 degrees of freedom
## AIC: 1678.4
##
## Number of Fisher Scoring iterations: 4
minGCS_glm <- glm(in_hospital_death ~ GCS_min, data=icu_patients_df1, family="binomial")
summary(minGCS_glm)
##
## Call:
## glm(formula = in_hospital_death ~ GCS_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6238 -0.6238 -0.5394 -0.4853 2.0964
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.40261 0.12298 -11.405 < 2e-16 ***
## GCS_min -0.04514 0.01317 -3.426 0.000612 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1687.7 on 2059 degrees of freedom
## AIC: 1691.7
##
## Number of Fisher Scoring iterations: 4
maxGlu_glm <- glm(in_hospital_death ~ Glucose_max, data=icu_patients_df1, family="binomial")
summary(maxGlu_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Glucose_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4117 -0.5572 -0.5343 -0.5162 2.0872
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.1865370 0.1202802 -18.179 < 2e-16 ***
## Glucose_max 0.0023817 0.0005819 4.093 4.25e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1684.2 on 2059 degrees of freedom
## AIC: 1688.2
##
## Number of Fisher Scoring iterations: 4
maxHR_glm <- glm(in_hospital_death ~ HR_max, data=icu_patients_df1, family="binomial")
summary(maxHR_glm)
##
## Call:
## glm(formula = in_hospital_death ~ HR_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1194 -0.5733 -0.5402 -0.5067 2.1517
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.707555 0.303251 -8.928 < 2e-16 ***
## HR_max 0.008565 0.002707 3.164 0.00156 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1689.9 on 2059 degrees of freedom
## AIC: 1693.9
##
## Number of Fisher Scoring iterations: 4
maxLactate_glm <- glm(in_hospital_death ~ Lactate_max, data=icu_patients_df1, family="binomial")
summary(maxLactate_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Lactate_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1726 -0.5544 -0.5200 -0.4939 2.1212
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.1932 0.1005 -21.820 < 2e-16 ***
## Lactate_max 0.1372 0.0244 5.625 1.86e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1669.5 on 2059 degrees of freedom
## AIC: 1673.5
##
## Number of Fisher Scoring iterations: 4
minPaCO2_glm <- glm(in_hospital_death ~ PaCO2_min, data=icu_patients_df1, family="binomial")
summary(minPaCO2_glm)
##
## Call:
## glm(formula = in_hospital_death ~ PaCO2_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.8026 -0.5767 -0.5530 -0.5081 2.4175
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.960864 0.295925 -3.247 0.00117 **
## PaCO2_min -0.022689 0.008111 -2.797 0.00516 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1691.4 on 2059 degrees of freedom
## AIC: 1695.4
##
## Number of Fisher Scoring iterations: 4
minpH_glm <- glm(in_hospital_death ~ pH_min, data=icu_patients_df1, family="binomial")
summary(minpH_glm)
##
## Call:
## glm(formula = in_hospital_death ~ pH_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.9980 -0.5733 -0.5358 -0.4868 2.2874
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 19.5912 4.8996 3.998 6.37e-05 ***
## pH_min -2.9197 0.6699 -4.358 1.31e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1677.4 on 2059 degrees of freedom
## AIC: 1681.4
##
## Number of Fisher Scoring iterations: 4
maxRR_glm <- glm(in_hospital_death ~ RespRate_max, data=icu_patients_df1, family="binomial")
summary(maxRR_glm)
##
## Call:
## glm(formula = in_hospital_death ~ RespRate_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4489 -0.5679 -0.5233 -0.4817 2.1771
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.835656 0.235885 -12.021 < 2e-16 ***
## RespRate_max 0.035250 0.007412 4.756 1.98e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1677.8 on 2059 degrees of freedom
## AIC: 1681.8
##
## Number of Fisher Scoring iterations: 4
minTemp_glm <- glm(in_hospital_death ~ Temp_min, data=icu_patients_df1, family="binomial")
summary(minTemp_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Temp_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.8918 -0.5741 -0.5409 -0.4973 2.2040
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.4599 2.3473 3.178 0.00148 **
## Temp_min -0.2571 0.0654 -3.931 8.45e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1684.3 on 2059 degrees of freedom
## AIC: 1688.3
##
## Number of Fisher Scoring iterations: 4
maxTropT_glm <- glm(in_hospital_death ~ TroponinT_max, data=icu_patients_df1, family="binomial")
summary(maxTropT_glm)
##
## Call:
## glm(formula = in_hospital_death ~ TroponinT_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.0740 -0.5503 -0.5430 -0.5416 1.9965
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.84719 0.06917 -26.705 <2e-16 ***
## TroponinT_max 0.06537 0.02638 2.478 0.0132 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1694.1 on 2059 degrees of freedom
## AIC: 1698.1
##
## Number of Fisher Scoring iterations: 4
minUrine_glm <- glm(in_hospital_death ~ Urine_min, data=icu_patients_df1, family="binomial")
summary(minUrine_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Urine_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6034 -0.5952 -0.5631 -0.5105 2.9438
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.610937 0.076052 -21.182 < 2e-16 ***
## Urine_min -0.006020 0.001787 -3.369 0.000756 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1683.1 on 2059 degrees of freedom
## AIC: 1687.1
##
## Number of Fisher Scoring iterations: 5
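A coefficient as small as the Urine_min estimate above is easier to interpret per larger increment. On the odds scale, exp(-0.006020) is about 0.994 per one-unit increase, while exp(100 * -0.006020) is about 0.55 per 100-unit increase. The sketch below, on simulated data (`sim_df` and its parameters are hypothetical), shows that rescaling the predictor inside the formula is equivalent to multiplying the coefficient.

```r
# Simulated stand-in data; values are illustrative only
set.seed(6)
n <- 500
sim_df <- data.frame(Urine_min = rgamma(n, shape = 2, scale = 40))
sim_df$in_hospital_death <- rbinom(n, 1, plogis(-1.5 - 0.006 * sim_df$Urine_min))

# Same model, with the predictor expressed per 1 unit and per 100 units
fit_unit <- glm(in_hospital_death ~ Urine_min, data = sim_df, family = "binomial")
fit_100  <- glm(in_hospital_death ~ I(Urine_min / 100), data = sim_df, family = "binomial")

# Odds ratios per 1-unit and per 100-unit increase
or_compare <- c(per_1 = exp(coef(fit_unit)[[2]]), per_100 = exp(coef(fit_100)[[2]]))
print(or_compare)
```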
maxWBC_glm <- glm(in_hospital_death ~ WBC_max, data=icu_patients_df1, family="binomial")
summary(maxWBC_glm)
##
## Call:
## glm(formula = in_hospital_death ~ WBC_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.2674 -0.5631 -0.5475 -0.5326 2.0545
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.982652 0.118080 -16.791 <2e-16 ***
## WBC_max 0.014086 0.006859 2.054 0.04 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1695.7 on 2059 degrees of freedom
## AIC: 1699.7
##
## Number of Fisher Scoring iterations: 4
age_glm <- glm(in_hospital_death ~ Age, data=icu_patients_df1, family="binomial")
summary(age_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Age, family = "binomial", data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7522 -0.6264 -0.5111 -0.3919 2.5135
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.761624 0.303337 -12.401 < 2e-16 ***
## Age 0.029376 0.004229 6.947 3.73e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1644.9 on 2059 degrees of freedom
## AIC: 1648.9
##
## Number of Fisher Scoring iterations: 5
gender_glm <- glm(in_hospital_death ~ Gender, data=icu_patients_df1, family="binomial")
summary(gender_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Gender, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5612 -0.5612 -0.5553 -0.5553 1.9728
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.76894 0.09381 -18.856 <2e-16 ***
## GenderMale -0.02281 0.12615 -0.181 0.856
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1699.7 on 2059 degrees of freedom
## AIC: 1703.7
##
## Number of Fisher Scoring iterations: 4
icuType_glm <- glm(in_hospital_death ~ ICUType, data=icu_patients_df1, family="binomial")
summary(icuType_glm)
##
## Call:
## glm(formula = in_hospital_death ~ ICUType, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6402 -0.6402 -0.5615 -0.3458 2.3861
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.6463 0.1576 -10.443 < 2e-16 ***
## ICUTypeCardiac Surgery Recovery Unit -1.1407 0.2563 -4.451 8.55e-06 ***
## ICUTypeMedical ICU 0.1653 0.1824 0.906 0.365
## ICUTypeSurgical ICU -0.1214 0.2001 -0.607 0.544
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1655.3 on 2057 degrees of freedom
## AIC: 1663.3
##
## Number of Fisher Scoring iterations: 5
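The three ICUType rows above each compare one unit with the reference level, so they do not test the factor as a whole. From the deviances reported, the drop is 1699.7 - 1655.3 = 44.4 on 3 degrees of freedom. One way to obtain that overall likelihood ratio test is `anova()` with `test = "Chisq"`. The following is a sketch on simulated data; the reference level name "Reference unit" and all simulation settings are hypothetical.

```r
# Simulated stand-in data with a four-level factor
set.seed(2)
n <- 800
icu_levels <- c("Reference unit", "Cardiac Surgery Recovery Unit",
                "Medical ICU", "Surgical ICU")
sim_df <- data.frame(
  ICUType = factor(sample(icu_levels, n, replace = TRUE), levels = icu_levels)
)
log_odds <- -1.6 + ifelse(sim_df$ICUType == "Cardiac Surgery Recovery Unit", -1.1, 0)
sim_df$in_hospital_death <- rbinom(n, 1, plogis(log_odds))

fit <- glm(in_hospital_death ~ ICUType, data = sim_df, family = "binomial")

# Sequential analysis of deviance: one row for the whole ICUType factor (3 df)
lrt <- anova(fit, test = "Chisq")
print(lrt)
```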
SAPS_glm <- glm(in_hospital_death ~ SAPS1, data=icu_patients_df1, family="binomial")
summary(SAPS_glm)
##
## Call:
## glm(formula = in_hospital_death ~ SAPS1, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.3834 -0.5894 -0.4662 -0.3448 2.6278
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.79727 0.23888 -15.896 <2e-16 ***
## SAPS1 0.12558 0.01338 9.384 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1627.0 on 1964 degrees of freedom
## Residual deviance: 1530.8 on 1963 degrees of freedom
## (96 observations deleted due to missingness)
## AIC: 1534.8
##
## Number of Fisher Scoring iterations: 5
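Because 96 observations were dropped for missing SAPS1, this model was fitted to fewer rows than the others (null deviance on 1964 rather than 2060 degrees of freedom), so its deviance and AIC are not directly comparable with theirs. A sketch of one possible approach is to restrict every model to the same complete rows before comparing. The data, missingness pattern and object names below are simulated assumptions.

```r
# Simulated stand-in data with missing values introduced in SAPS1
set.seed(3)
n <- 600
sim_df <- data.frame(SAPS1 = rpois(n, 14), SOFA = rpois(n, 6))
sim_df$in_hospital_death <- rbinom(n, 1, plogis(-3.5 + 0.1 * sim_df$SAPS1))
sim_df$SAPS1[sample(n, 30)] <- NA

# Count missing values per column
print(colSums(is.na(sim_df)))

# Keep only rows that are complete for every variable being compared
common <- sim_df[complete.cases(sim_df[, c("SAPS1", "SOFA", "in_hospital_death")]), ]

fit_saps <- glm(in_hospital_death ~ SAPS1, data = common, family = "binomial")
fit_sofa <- glm(in_hospital_death ~ SOFA, data = common, family = "binomial")

# Both models now use the same observations, so the AIC values are comparable
print(AIC(fit_saps, fit_sofa))
```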
SOFA_glm <- glm(in_hospital_death ~ SOFA, data=icu_patients_df1, family="binomial")
summary(SOFA_glm)
##
## Call:
## glm(formula = in_hospital_death ~ SOFA, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.0747 -0.5835 -0.4771 -0.3623 2.4609
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.83453 0.14090 -20.117 <2e-16 ***
## SOFA 0.14378 0.01539 9.342 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1607.5 on 2059 degrees of freedom
## AIC: 1611.5
##
## Number of Fisher Scoring iterations: 5
### not significant variables but clinically relevant ###
maxFiO2_glm <- glm(in_hospital_death ~ FiO2_max, data=icu_patients_df1, family="binomial")
summary(maxFiO2_glm)
##
## Call:
## glm(formula = in_hospital_death ~ FiO2_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5634 -0.5634 -0.5555 -0.5504 1.9898
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.8614 0.2102 -8.856 <2e-16 ***
## FiO2_max 0.1011 0.2533 0.399 0.69
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1699.5 on 2059 degrees of freedom
## AIC: 1703.5
##
## Number of Fisher Scoring iterations: 4
minHCT_glm <- glm(in_hospital_death ~ HCT_min, data=icu_patients_df1, family="binomial")
summary(minHCT_glm)
##
## Call:
## glm(formula = in_hospital_death ~ HCT_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6521 -0.5743 -0.5512 -0.5203 2.1340
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.16563 0.33501 -3.479 0.000503 ***
## HCT_min -0.02064 0.01111 -1.857 0.063306 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1696.2 on 2059 degrees of freedom
## AIC: 1700.2
##
## Number of Fisher Scoring iterations: 4
maxK_glm <- glm(in_hospital_death ~ K_max, data=icu_patients_df1, family="binomial")
summary(maxK_glm) # not significant but has been included in the SAPS
##
## Call:
## glm(formula = in_hospital_death ~ K_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.2402 -0.5620 -0.5512 -0.5380 2.0561
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.24634 0.28233 -7.956 1.77e-15 ***
## K_max 0.10449 0.06153 1.698 0.0895 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1697.0 on 2059 degrees of freedom
## AIC: 1701
##
## Number of Fisher Scoring iterations: 4
minMAP_glm <- glm(in_hospital_death ~ MAP_min, data=icu_patients_df1, family="binomial")
summary(minMAP_glm)
##
## Call:
## glm(formula = in_hospital_death ~ MAP_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6583 -0.5674 -0.5551 -0.5341 2.4214
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.413112 0.257434 -5.489 4.04e-08 ***
## MAP_min -0.005926 0.004051 -1.463 0.143
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1697.4 on 2059 degrees of freedom
## AIC: 1701.4
##
## Number of Fisher Scoring iterations: 4
maxPaCO2_glm <- glm(in_hospital_death ~ PaCO2_max, data=icu_patients_df1, family="binomial")
summary(maxPaCO2_glm)
##
## Call:
## glm(formula = in_hospital_death ~ PaCO2_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6389 -0.5715 -0.5556 -0.5300 2.1798
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.32279 0.27600 -4.793 1.65e-06 ***
## PaCO2_max -0.01017 0.00601 -1.691 0.0908 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1696.7 on 2059 degrees of freedom
## AIC: 1700.7
##
## Number of Fisher Scoring iterations: 4
minPaO2_glm <- glm(in_hospital_death ~ PaO2_min, data=icu_patients_df1, family="binomial")
summary(minPaO2_glm)
##
## Call:
## glm(formula = in_hospital_death ~ PaO2_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5747 -0.5632 -0.5591 -0.5484 2.0765
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.702103 0.143274 -11.880 <2e-16 ***
## PaO2_min -0.000757 0.001235 -0.613 0.54
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1699.3 on 2059 degrees of freedom
## AIC: 1703.3
##
## Number of Fisher Scoring iterations: 4
minPlt_glm <- glm(in_hospital_death ~ Platelets_min, data=icu_patients_df1, family="binomial")
summary(minPlt_glm) # not significant but has been included in the SOFA score
##
## Call:
## glm(formula = in_hospital_death ~ Platelets_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6122 -0.5735 -0.5558 -0.5260 2.2141
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.5693181 0.1352494 -11.603 <2e-16 ***
## Platelets_min -0.0010963 0.0006322 -1.734 0.0829 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1696.6 on 2059 degrees of freedom
## AIC: 1700.6
##
## Number of Fisher Scoring iterations: 4
minSaO2_glm <- glm(in_hospital_death ~ SaO2_min, data=icu_patients_df1, family="binomial")
summary(minSaO2_glm)
##
## Call:
## glm(formula = in_hospital_death ~ SaO2_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5753 -0.5578 -0.5575 -0.5573 1.9703
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.680258 1.535685 -1.094 0.274
## SaO2_min -0.001057 0.016011 -0.066 0.947
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1699.7 on 2059 degrees of freedom
## AIC: 1703.7
##
## Number of Fisher Scoring iterations: 4
minWeight_glm <- glm(in_hospital_death ~ Weight_min, data=icu_patients_df1, family="binomial")
summary(minWeight_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Weight_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6288 -0.5816 -0.5622 -0.5305 2.1416
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.343757 0.241289 -5.569 2.56e-08 ***
## Weight_min -0.005110 0.002943 -1.736 0.0826 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1604.2 on 1914 degrees of freedom
## Residual deviance: 1601.0 on 1913 degrees of freedom
## (146 observations deleted due to missingness)
## AIC: 1605
##
## Number of Fisher Scoring iterations: 4
LOS_glm <- glm(in_hospital_death ~ Length_of_stay, data=icu_patients_df1, family="binomial")
summary(LOS_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Length_of_stay, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.8655 -0.5594 -0.5466 -0.5413 2.0058
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.882249 0.089155 -21.112 <2e-16 ***
## Length_of_stay 0.007099 0.004331 1.639 0.101
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1697.2 on 2059 degrees of freedom
## AIC: 1701.2
##
## Number of Fisher Scoring iterations: 4
### not significant and may not be clinically relevant ###
minGlu_glm <- glm(in_hospital_death ~ Glucose_min, data=icu_patients_df1, family="binomial")
summary(minGlu_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Glucose_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7799 -0.5613 -0.5522 -0.5428 2.0271
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.967537 0.171241 -11.490 <2e-16 ***
## Glucose_min 0.001476 0.001253 1.178 0.239
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1698.4 on 2059 degrees of freedom
## AIC: 1702.4
##
## Number of Fisher Scoring iterations: 4
minHR_glm <- glm(in_hospital_death ~ HR_min, data=icu_patients_df1, family="binomial")
summary(minHR_glm)
##
## Call:
## glm(formula = in_hospital_death ~ HR_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6235 -0.5656 -0.5528 -0.5390 2.1087
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.108733 0.301434 -6.996 2.64e-12 ***
## HR_min 0.004520 0.004052 1.115 0.265
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1698.4 on 2059 degrees of freedom
## AIC: 1702.4
##
## Number of Fisher Scoring iterations: 4
minK_glm <- glm(in_hospital_death ~ K_min, data=icu_patients_df1, family="binomial")
summary(minK_glm)
##
## Call:
## glm(formula = in_hospital_death ~ K_min, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6024 -0.5647 -0.5546 -0.5447 2.0345
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.47413 0.42361 -3.480 0.000502 ***
## K_min -0.07804 0.10660 -0.732 0.464083
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1699.1 on 2059 degrees of freedom
## AIC: 1703.1
##
## Number of Fisher Scoring iterations: 4
maxpH_glm <- glm(in_hospital_death ~ pH_max, data=icu_patients_df1, family="binomial")
summary(maxpH_glm)
##
## Call:
## glm(formula = in_hospital_death ~ pH_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6684 -0.5677 -0.5523 -0.5297 2.0743
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 9.3001 7.0197 1.325 0.185
## pH_max -1.4944 0.9469 -1.578 0.115
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1697.2 on 2059 degrees of freedom
## AIC: 1701.2
##
## Number of Fisher Scoring iterations: 4
maxTemp_glm <- glm(in_hospital_death ~ Temp_max, data=icu_patients_df1, family="binomial")
summary(maxTemp_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Temp_max, family = "binomial",
## data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6077 -0.5689 -0.5549 -0.5366 2.1386
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.60419 3.08152 0.521 0.603
## Temp_max -0.08988 0.08183 -1.098 0.272
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1698.5 on 2059 degrees of freedom
## AIC: 1702.5
##
## Number of Fisher Scoring iterations: 4
# IN SUMMARY:
# Variables that are significant on univariate analysis and clinically relevant are:
# minAlbumin, maxBili, maxUrea, maxCr, minGCS, maxGlu, maxHR, maxLactate, minPaCO2
# minpH, maxRR, minTemp, maxTropT, minUrine, maxWBC
# age, icuType, SAPS, SOFA
# Variables not significant but still clinically relevant are:
# gender (p = 0.856), maxFiO2, minHCT, maxK, minMAP, maxPaCO2, minPaO2, minPlt, minSaO2, minWeight, LOS
# Variables not significant and may not be relevant are:
# minGlu, minHR, minK, maxpH, maxTemp
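The univariate screening above repeats the same `glm()` call for each variable. The repetition could be condensed into a loop that collects the estimate and p-value for each predictor in one table. This is a sketch on simulated data: `sim_df`, `predictors` and the simulation settings are hypothetical, and the row indexing assumes each predictor is continuous with a single coefficient.

```r
# Simulated stand-in data; the real analysis would use icu_patients_df1
set.seed(4)
n <- 500
sim_df <- data.frame(
  Age     = round(runif(n, 18, 90)),
  BUN_max = rgamma(n, shape = 2, scale = 12),
  HR_min  = rnorm(n, 70, 12)
)
sim_df$in_hospital_death <- rbinom(
  n, 1, plogis(-4 + 0.03 * sim_df$Age + 0.02 * sim_df$BUN_max)
)

predictors <- c("Age", "BUN_max", "HR_min")

# Fit one logistic model per predictor and keep its estimate and p-value
univariate <- do.call(rbind, lapply(predictors, function(v) {
  fit <- glm(reformulate(v, response = "in_hospital_death"),
             data = sim_df, family = "binomial")
  est <- coef(summary(fit))[2, ]  # row 1 is the intercept, row 2 the predictor
  data.frame(variable = v,
             estimate = est[["Estimate"]],
             p_value  = est[["Pr(>|z|)"]])
}))

# Sort so the smallest p-values come first
print(univariate[order(univariate$p_value), ])
```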
# Considering ALL variables:
# column 1 is record id
# column 2 is Length_of_stay
# column 3 is SAPS1
# column 4 is SOFA
# column 5 is survival
# column 7 is days
# column 8 is status
# column 9 is age
# column 43 is gender
# column 53 is Height
# column 57 is ICUType
# these columns should be excluded - the relevant ones will be reincluded in future models
# Fitting all the variables in a single model leads to a linearity error,
# so the variables are split into min, max and diff sets and modelled separately
### min ICU data ###
minICUdata <- icu_patients_df1[, -c(1,2,3,4,5,7,8,9,43,53,57)] # remove columns as above
minICUdata <- minICUdata[, c(1, seq(from=4, to=109, by = 3))] # every third column starting from a min column
minICU_glm <- glm(in_hospital_death ~ . ,data=minICUdata, family="binomial")
summary(minICU_glm)
##
## Call:
## glm(formula = in_hospital_death ~ ., family = "binomial", data = minICUdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.2955 -0.5727 -0.4072 -0.2776 2.7828
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.773e+01 9.773e+00 1.814 0.06970 .
## Albumin_min -2.524e-01 1.715e-01 -1.472 0.14113
## ALP_min 5.055e-04 1.257e-03 0.402 0.68760
## ALT_min 2.757e-05 8.303e-04 0.033 0.97352
## AST_min 4.045e-04 7.875e-04 0.514 0.60751
## Bilirubin_min 4.668e-02 2.370e-02 1.970 0.04889 *
## BUN_min 3.791e-02 6.614e-03 5.733 9.89e-09 ***
## Cholesterol_min 1.887e-03 2.871e-03 0.657 0.51087
## Creatinine_min -1.937e-01 1.019e-01 -1.901 0.05726 .
## DiasABP_min -3.576e-02 1.620e-02 -2.207 0.02731 *
## FiO2_min 4.107e-02 7.920e-01 0.052 0.95864
## GCS_min -3.161e-02 2.894e-02 -1.092 0.27471
## Glucose_min 1.368e-03 1.915e-03 0.714 0.47517
## HCO3_min 6.303e-03 3.293e-02 0.191 0.84824
## HCT_min 5.159e-02 2.041e-02 2.528 0.01146 *
## HR_min 2.609e-03 7.060e-03 0.370 0.71170
## K_min 5.810e-02 1.864e-01 0.312 0.75524
## Lactate_min 2.505e-01 9.530e-02 2.629 0.00857 **
## MAP_min 5.312e-03 1.173e-02 0.453 0.65059
## Mg_min -1.834e-01 2.599e-01 -0.706 0.48047
## Na_min -4.391e-02 2.213e-02 -1.984 0.04722 *
## NIDiasABP_min -7.657e-03 2.002e-02 -0.383 0.70205
## NIMAP_min 5.751e-03 2.884e-02 0.199 0.84197
## NISysABP_min -4.665e-03 1.165e-02 -0.400 0.68893
## PaCO2_min -1.525e-02 1.924e-02 -0.792 0.42822
## PaO2_min 4.963e-03 2.060e-03 2.409 0.01600 *
## pH_min -8.939e-01 1.288e+00 -0.694 0.48759
## Platelets_min -1.537e-03 1.223e-03 -1.257 0.20866
## RespRate_min 4.789e-02 2.834e-02 1.690 0.09110 .
## SaO2_min -2.344e-03 2.228e-02 -0.105 0.91620
## SysABP_min 1.156e-02 8.330e-03 1.387 0.16532
## Temp_min -2.360e-01 1.047e-01 -2.254 0.02420 *
## TroponinI_min 6.947e-04 9.713e-03 0.072 0.94298
## TroponinT_min 4.483e-02 7.785e-02 0.576 0.56474
## Urine_min -9.429e-03 4.344e-03 -2.171 0.02994 *
## WBC_min 7.786e-03 1.606e-02 0.485 0.62787
## Weight_min -5.260e-03 4.887e-03 -1.076 0.28176
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 824.13 on 900 degrees of freedom
## Residual deviance: 666.99 on 864 degrees of freedom
## (1160 observations deleted due to missingness)
## AIC: 740.99
##
## Number of Fisher Scoring iterations: 6
# Bilirubin_min, BUN_min, DiasABP_min, HCT_min, Lactate_min, Na_min, PaO2_min, Temp_min, Urine_min were statistically significant
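Selecting every third column by position depends on the exact column order after the exclusions. A name-based selection is less fragile if columns are ever added or reordered. The sketch below uses a hypothetical helper, `select_by_suffix()`, on a toy data frame; neither is part of the original analysis.

```r
# Toy data frame using the _diff/_max/_min naming pattern
sim_df <- data.frame(
  in_hospital_death = c(0, 1, 0, 1),
  Albumin_diff = 1:4, Albumin_max = 5:8, Albumin_min = 9:12,
  BUN_diff = 1:4, BUN_max = 5:8, BUN_min = 9:12
)

# Hypothetical helper: keep the outcome plus every column ending in "_<suffix>"
select_by_suffix <- function(df, suffix, outcome = "in_hospital_death") {
  keep <- grep(paste0("_", suffix, "$"), names(df), value = TRUE)
  df[, c(outcome, keep), drop = FALSE]
}

min_cols <- names(select_by_suffix(sim_df, "min"))
print(min_cols)
```

The same call with `"max"` or `"diff"` would give the other two subsets.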
### max ICU data ###
maxICUdata <- icu_patients_df1[, -c(1,2,3,4,5,7,8,9,43,53,57)] # remove columns as above
maxICUdata <- maxICUdata[, c(1, seq(from=3, to=109, by = 3))] # every third column starting from a max column
maxICU_glm <- glm(in_hospital_death ~ . ,data=maxICUdata, family="binomial")
summary(maxICU_glm)
##
## Call:
## glm(formula = in_hospital_death ~ ., family = "binomial", data = maxICUdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.3984 -0.5545 -0.3605 -0.1874 2.8434
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 21.2860253 18.0303070 1.181 0.23777
## Albumin_max -0.0825905 0.1815881 -0.455 0.64924
## ALP_max 0.0015035 0.0012509 1.202 0.22938
## ALT_max -0.0008123 0.0008072 -1.006 0.31427
## AST_max 0.0006405 0.0005096 1.257 0.20878
## Bilirubin_max 0.0343710 0.0221317 1.553 0.12042
## BUN_max 0.0254793 0.0063233 4.029 5.59e-05 ***
## Cholesterol_max -0.0018775 0.0033489 -0.561 0.57504
## Creatinine_max -0.1781563 0.0855192 -2.083 0.03723 *
## DiasABP_max -0.0316657 0.0097136 -3.260 0.00111 **
## FiO2_max 0.6685025 0.4957475 1.348 0.17751
## GCS_max -0.1939523 0.0361370 -5.367 8.00e-08 ***
## Glucose_max 0.0011024 0.0011521 0.957 0.33864
## HCO3_max -0.0372768 0.0365012 -1.021 0.30714
## HCT_max -0.0096146 0.0239799 -0.401 0.68846
## HR_max 0.0028140 0.0050193 0.561 0.57505
## K_max 0.0947209 0.1390359 0.681 0.49570
## Lactate_max 0.1428551 0.0539140 2.650 0.00806 **
## MAP_max 0.0037985 0.0030598 1.241 0.21446
## Mg_max -0.3885880 0.2473900 -1.571 0.11624
## Na_max -0.0451421 0.0230563 -1.958 0.05024 .
## NIDiasABP_max 0.0131768 0.0150320 0.877 0.38071
## NIMAP_max -0.0080854 0.0200386 -0.403 0.68659
## NISysABP_max 0.0057593 0.0082049 0.702 0.48272
## PaCO2_max 0.0059686 0.0126673 0.471 0.63751
## PaO2_max -0.0011683 0.0011319 -1.032 0.30201
## pH_max 1.3750765 2.1827519 0.630 0.52871
## Platelets_max -0.0023159 0.0011068 -2.092 0.03640 *
## RespRate_max 0.0063621 0.0166524 0.382 0.70242
## SaO2_max -0.1018260 0.0597825 -1.703 0.08852 .
## SysABP_max 0.0093153 0.0055154 1.689 0.09123 .
## Temp_max -0.3690320 0.1461566 -2.525 0.01157 *
## TroponinI_max -0.0089373 0.0108825 -0.821 0.41150
## TroponinT_max 0.0463654 0.0581954 0.797 0.42561
## Urine_max -0.0009914 0.0003290 -3.013 0.00259 **
## WBC_max 0.0049944 0.0134355 0.372 0.71009
## Weight_max -0.0055211 0.0045915 -1.202 0.22919
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 824.13 on 900 degrees of freedom
## Residual deviance: 624.41 on 864 degrees of freedom
## (1160 observations deleted due to missingness)
## AIC: 698.41
##
## Number of Fisher Scoring iterations: 6
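# Significant at p < 0.05: BUN_max, Creatinine_max, DiasABP_max, GCS_max, Lactate_max, Platelets_max, Temp_max, Urine_max
# (DiasABP_max and Lactate_max are not carried forward into the combined model below)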
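A list of significant terms like the one above can also be read off the fitted model rather than copied by eye, which avoids missing a row or misspelling a name. The following is a minimal sketch on simulated data; `toy`, `x1` to `x3` and `sig_terms` are hypothetical names that do not exist in the assignment data.

```r
# Toy data standing in for the ICU data frame (simulated, hypothetical values).
set.seed(1)
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * toy$x1))

fit <- glm(y ~ ., data = toy, family = "binomial")

# coef(summary(fit)) is a matrix with one row per term;
# the "Pr(>|z|)" column holds the Wald p-values shown in the summary table.
pvals <- coef(summary(fit))[, "Pr(>|z|)"]

# Names of the non-intercept terms with p < 0.05.
sig_terms <- setdiff(names(pvals)[pvals < 0.05], "(Intercept)")
print(sig_terms)
```

For factor predictors the row names are the dummy-variable names (for example `ICUTypeMedical ICU`), so they would need mapping back to the original column before being reused in a formula.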
### diff ICU data ###
diffICUdata <- icu_patients_df1[, -c(1,2,3,4,5,7,8,9,43,53,57)] # remove columns as above
diffICUdata <- diffICUdata[, c(1, seq(from=2, to=109, by = 3))] # every third column starting from a diff column
diffICU_glm <- glm(in_hospital_death ~ . ,data=diffICUdata, family="binomial")
summary(diffICU_glm)
##
## Call:
## glm(formula = in_hospital_death ~ ., family = "binomial", data = diffICUdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.7007 -0.5804 -0.4198 -0.2603 2.8621
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.025e+00 5.635e-01 -7.143 9.1e-13 ***
## Albumin_diff 6.868e-01 2.531e-01 2.714 0.00665 **
## ALP_diff 1.003e-03 1.406e-03 0.714 0.47544
## ALT_diff 2.852e-05 7.942e-04 0.036 0.97135
## AST_diff 1.865e-04 4.892e-04 0.381 0.70307
## Bilirubin_diff 4.298e-02 2.103e-02 2.044 0.04098 *
## BUN_diff 1.910e-02 6.328e-03 3.017 0.00255 **
## Cholesterol_diff 5.767e-03 4.217e-03 1.368 0.17144
## Creatinine_diff -1.058e-01 8.330e-02 -1.270 0.20419
## DiasABP_diff -1.396e-02 9.392e-03 -1.487 0.13712
## FiO2_diff 4.869e-01 6.906e-01 0.705 0.48085
## GCS_diff 4.374e-02 5.009e-02 0.873 0.38259
## Glucose_diff 1.647e-03 1.341e-03 1.229 0.21924
## HCO3_diff 2.568e-02 3.403e-02 0.754 0.45059
## HCT_diff -2.201e-02 2.829e-02 -0.778 0.43653
## HR_diff 4.489e-03 5.827e-03 0.770 0.44102
## K_diff 9.497e-02 1.626e-01 0.584 0.55930
## Lactate_diff 1.227e-01 6.884e-02 1.783 0.07460 .
## MAP_diff 1.892e-04 2.966e-03 0.064 0.94914
## Mg_diff -6.705e-01 3.348e-01 -2.003 0.04522 *
## Na_diff 5.171e-02 2.748e-02 1.882 0.05987 .
## NIDiasABP_diff 2.526e-03 1.431e-02 0.176 0.85991
## NIMAP_diff -7.757e-03 1.892e-02 -0.410 0.68189
## NISysABP_diff 1.898e-02 8.513e-03 2.229 0.02580 *
## PaCO2_diff -2.988e-04 1.391e-02 -0.021 0.98286
## PaO2_diff -1.815e-03 1.439e-03 -1.261 0.20731
## pH_diff 1.868e+00 1.760e+00 1.061 0.28853
## Platelets_diff -8.372e-04 1.255e-03 -0.667 0.50489
## RespRate_diff 1.855e-02 1.542e-02 1.203 0.22890
## SaO2_diff -1.858e-02 2.566e-02 -0.724 0.46890
## SysABP_diff 3.780e-03 5.935e-03 0.637 0.52421
## Temp_diff 2.411e-01 1.253e-01 1.924 0.05432 .
## TroponinI_diff -1.882e-02 1.114e-02 -1.689 0.09129 .
## TroponinT_diff 2.464e-02 6.472e-02 0.381 0.70346
## Urine_diff -1.119e-03 3.394e-04 -3.296 0.00098 ***
## WBC_diff -5.133e-03 1.611e-02 -0.319 0.75000
## Weight_diff 7.117e-03 5.451e-03 1.306 0.19165
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 824.13 on 900 degrees of freedom
## Residual deviance: 680.80 on 864 degrees of freedom
## (1160 observations deleted due to missingness)
## AIC: 754.8
##
## Number of Fisher Scoring iterations: 6
# Albumin_diff, Bilirubin_diff, BUN_diff, Mg_diff, NISysABP_diff, Urine_diff were statistically significant
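The positional selection used to build `diffICUdata` (`seq(from=2, to=109, by = 3)`) depends on the exact column order. A possible alternative is to select the `_diff` columns by name. This is only a sketch on a hypothetical toy data frame; whether it reproduces the same column set in `icu_patients_df1` depends on which columns were dropped beforehand.

```r
# Hypothetical toy data frame with the same naming pattern as the ICU data.
toy <- data.frame(
  in_hospital_death = c(0, 1, 0),
  Albumin_min = 1:3, Albumin_max = 4:6, Albumin_diff = 7:9,
  BUN_min = 1:3, BUN_max = 4:6, BUN_diff = 7:9
)

# Pick every column whose name ends in "_diff", regardless of its position.
diff_cols <- grep("_diff$", names(toy), value = TRUE)

# Keep the outcome plus the selected summary columns.
toy_diff <- toy[, c("in_hospital_death", diff_cols)]
print(names(toy_diff))
```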
sum(is.na(icu_patients_df1$NISysABP_diff)) # NISysABP_diff has 453 missing values, so it is left out of the combined model below
## [1] 453
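The same check can be run for every column at once. This matters because the summaries above report 1160 observations deleted due to missingness, meaning `glm()` fitted on complete cases only. The sketch below uses a small hypothetical data frame (`toy`) rather than the assignment data.

```r
# Hypothetical toy data frame with some missing values.
toy <- data.frame(a = c(1, NA, 3, 4),
                  b = c(NA, NA, 3, 4),
                  c = 1:4)

# Number of missing values per column, largest first.
na_counts <- colSums(is.na(toy))
print(sort(na_counts, decreasing = TRUE))

# Number of rows with no missing value in any column,
# i.e. the rows a model using all columns would keep.
n_complete <- sum(complete.cases(toy))
print(n_complete)
```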
### combined model using the significant min/max/diff variables ###
minmaxdiffICU_glm <- glm(in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType + Gender +
  # min variables that were significant
  Bilirubin_min + BUN_min + HCT_min + Lactate_min + Temp_min + Na_min + PaO2_min + Urine_min +
  # max variables that were significant
  BUN_max + Creatinine_max + GCS_max + Platelets_max + Temp_max + Urine_max +
  # diff variables that were significant (NISysABP_diff excluded because of its missing values)
  Albumin_diff + Bilirubin_diff + BUN_diff + Mg_diff + Urine_diff,
  data = icu_patients_df1, family = "binomial")
summary(minmaxdiffICU_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Age + Length_of_stay + SOFA +
## SAPS1 + ICUType + Gender + Bilirubin_min + BUN_min + HCT_min +
## Lactate_min + Temp_min + Na_min + PaO2_min + Urine_min +
## BUN_max + Creatinine_max + GCS_max + Platelets_max + Temp_max +
## Urine_max + Albumin_diff + Bilirubin_diff + BUN_diff + Mg_diff +
## Urine_diff, family = "binomial", data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1298 -0.5279 -0.3351 -0.1930 3.0284
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 8.953e+00 4.453e+00 2.010 0.044380 *
## Age 2.150e-02 5.569e-03 3.860 0.000113 ***
## Length_of_stay -5.846e-03 5.935e-03 -0.985 0.324598
## SOFA 2.214e-02 2.558e-02 0.865 0.386846
## SAPS1 6.715e-02 2.305e-02 2.914 0.003573 **
## ICUTypeCardiac Surgery Recovery Unit -1.119e+00 3.071e-01 -3.645 0.000268 ***
## ICUTypeMedical ICU 9.494e-02 2.203e-01 0.431 0.666504
## ICUTypeSurgical ICU 6.277e-02 2.449e-01 0.256 0.797684
## GenderMale -4.140e-02 1.530e-01 -0.271 0.786661
## Bilirubin_min 1.811e-01 6.791e-02 2.667 0.007662 **
## BUN_min 4.560e-02 1.559e-02 2.924 0.003452 **
## HCT_min -4.119e-03 1.404e-02 -0.293 0.769259
## Lactate_min 1.010e-01 5.631e-02 1.794 0.072760 .
## Temp_min -1.165e-01 8.649e-02 -1.347 0.178135
## Na_min -4.704e-02 1.426e-02 -3.298 0.000973 ***
## PaO2_min 2.825e-05 1.393e-03 0.020 0.983819
## Urine_min -8.609e-04 1.913e-03 -0.450 0.652655
## BUN_max -1.251e-02 1.690e-02 -0.740 0.459170
## Creatinine_max -1.953e-01 7.145e-02 -2.734 0.006259 **
## GCS_max -1.538e-01 2.540e-02 -6.056 1.39e-09 ***
## Platelets_max -1.065e-03 6.930e-04 -1.536 0.124530
## Temp_max -1.684e-02 1.058e-01 -0.159 0.873546
## Urine_max -3.776e-03 1.873e-03 -2.016 0.043830 *
## Albumin_diff 3.366e-01 1.918e-01 1.755 0.079295 .
## Bilirubin_diff -1.445e-01 6.881e-02 -2.100 0.035732 *
## BUN_diff -7.692e-03 9.286e-03 -0.828 0.407450
## Mg_diff -5.779e-02 1.920e-01 -0.301 0.763420
## Urine_diff 3.300e-03 1.943e-03 1.699 0.089393 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1627.0 on 1964 degrees of freedom
## Residual deviance: 1274.6 on 1937 degrees of freedom
## (96 observations deleted due to missingness)
## AIC: 1330.6
##
## Number of Fisher Scoring iterations: 6
step_minmaxdiffICU_glm <- step(minmaxdiffICU_glm, trace=1)
## Start: AIC=1330.57
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Gender + Bilirubin_min + BUN_min + HCT_min + Lactate_min +
## Temp_min + Na_min + PaO2_min + Urine_min + BUN_max + Creatinine_max +
## GCS_max + Platelets_max + Temp_max + Urine_max + Albumin_diff +
## Bilirubin_diff + BUN_diff + Mg_diff + Urine_diff
##
## Df Deviance AIC
## - PaO2_min 1 1274.6 1328.6
## - Temp_max 1 1274.6 1328.6
## - Gender 1 1274.6 1328.6
## - HCT_min 1 1274.7 1328.7
## - Mg_diff 1 1274.7 1328.7
## - Urine_min 1 1274.8 1328.8
## - BUN_max 1 1275.1 1329.1
## - BUN_diff 1 1275.3 1329.3
## - SOFA 1 1275.3 1329.3
## - Length_of_stay 1 1275.6 1329.6
## - Temp_min 1 1276.4 1330.4
## <none> 1274.6 1330.6
## - Platelets_max 1 1277.0 1331.0
## - Urine_diff 1 1277.4 1331.4
## - Albumin_diff 1 1277.6 1331.6
## - Lactate_min 1 1277.8 1331.8
## - Urine_max 1 1278.6 1332.6
## - Bilirubin_diff 1 1279.3 1333.3
## - Bilirubin_min 1 1282.1 1336.1
## - Creatinine_max 1 1283.0 1337.0
## - SAPS1 1 1283.1 1337.1
## - BUN_min 1 1284.1 1338.1
## - Na_min 1 1285.1 1339.1
## - Age 1 1290.1 1344.1
## - ICUType 3 1300.6 1350.6
## - GCS_max 1 1311.5 1365.5
##
## Step: AIC=1328.57
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Gender + Bilirubin_min + BUN_min + HCT_min + Lactate_min +
## Temp_min + Na_min + Urine_min + BUN_max + Creatinine_max +
## GCS_max + Platelets_max + Temp_max + Urine_max + Albumin_diff +
## Bilirubin_diff + BUN_diff + Mg_diff + Urine_diff
##
## Df Deviance AIC
## - Temp_max 1 1274.6 1326.6
## - Gender 1 1274.6 1326.6
## - HCT_min 1 1274.7 1326.7
## - Mg_diff 1 1274.7 1326.7
## - Urine_min 1 1274.8 1326.8
## - BUN_max 1 1275.1 1327.1
## - BUN_diff 1 1275.3 1327.3
## - SOFA 1 1275.3 1327.3
## - Length_of_stay 1 1275.6 1327.6
## - Temp_min 1 1276.4 1328.4
## <none> 1274.6 1328.6
## - Platelets_max 1 1277.0 1329.0
## - Urine_diff 1 1277.5 1329.5
## - Albumin_diff 1 1277.6 1329.6
## - Lactate_min 1 1277.8 1329.8
## - Urine_max 1 1278.6 1330.6
## - Bilirubin_diff 1 1279.3 1331.3
## - Bilirubin_min 1 1282.1 1334.1
## - Creatinine_max 1 1283.0 1335.0
## - SAPS1 1 1283.1 1335.1
## - BUN_min 1 1284.1 1336.1
## - Na_min 1 1285.2 1337.2
## - Age 1 1290.1 1342.1
## - ICUType 3 1300.6 1348.6
## - GCS_max 1 1312.4 1364.4
##
## Step: AIC=1326.59
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Gender + Bilirubin_min + BUN_min + HCT_min + Lactate_min +
## Temp_min + Na_min + Urine_min + BUN_max + Creatinine_max +
## GCS_max + Platelets_max + Urine_max + Albumin_diff + Bilirubin_diff +
## BUN_diff + Mg_diff + Urine_diff
##
## Df Deviance AIC
## - Gender 1 1274.7 1324.7
## - HCT_min 1 1274.7 1324.7
## - Mg_diff 1 1274.7 1324.7
## - Urine_min 1 1274.8 1324.8
## - BUN_max 1 1275.1 1325.1
## - BUN_diff 1 1275.3 1325.3
## - SOFA 1 1275.4 1325.4
## - Length_of_stay 1 1275.6 1325.6
## <none> 1274.6 1326.6
## - Platelets_max 1 1277.0 1327.0
## - Temp_min 1 1277.1 1327.1
## - Urine_diff 1 1277.5 1327.5
## - Albumin_diff 1 1277.7 1327.7
## - Lactate_min 1 1277.8 1327.8
## - Urine_max 1 1278.7 1328.7
## - Bilirubin_diff 1 1279.3 1329.3
## - Bilirubin_min 1 1282.2 1332.2
## - Creatinine_max 1 1283.1 1333.1
## - SAPS1 1 1283.5 1333.5
## - BUN_min 1 1284.2 1334.2
## - Na_min 1 1285.2 1335.2
## - Age 1 1290.5 1340.5
## - ICUType 3 1300.6 1346.6
## - GCS_max 1 1312.7 1362.7
##
## Step: AIC=1324.67
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Bilirubin_min + BUN_min + HCT_min + Lactate_min + Temp_min +
## Na_min + Urine_min + BUN_max + Creatinine_max + GCS_max +
## Platelets_max + Urine_max + Albumin_diff + Bilirubin_diff +
## BUN_diff + Mg_diff + Urine_diff
##
## Df Deviance AIC
## - Mg_diff 1 1274.8 1322.8
## - HCT_min 1 1274.8 1322.8
## - Urine_min 1 1274.9 1322.9
## - BUN_max 1 1275.2 1323.2
## - BUN_diff 1 1275.3 1323.3
## - SOFA 1 1275.4 1323.4
## - Length_of_stay 1 1275.7 1323.7
## <none> 1274.7 1324.7
## - Platelets_max 1 1277.1 1325.1
## - Temp_min 1 1277.2 1325.2
## - Urine_diff 1 1277.7 1325.7
## - Albumin_diff 1 1277.8 1325.8
## - Lactate_min 1 1277.8 1325.8
## - Urine_max 1 1279.0 1327.0
## - Bilirubin_diff 1 1279.3 1327.3
## - Bilirubin_min 1 1282.2 1330.2
## - Creatinine_max 1 1283.4 1331.4
## - SAPS1 1 1283.7 1331.7
## - BUN_min 1 1284.2 1332.2
## - Na_min 1 1285.3 1333.3
## - Age 1 1290.7 1338.7
## - ICUType 3 1300.9 1344.9
## - GCS_max 1 1312.7 1360.7
##
## Step: AIC=1322.75
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Bilirubin_min + BUN_min + HCT_min + Lactate_min + Temp_min +
## Na_min + Urine_min + BUN_max + Creatinine_max + GCS_max +
## Platelets_max + Urine_max + Albumin_diff + Bilirubin_diff +
## BUN_diff + Urine_diff
##
## Df Deviance AIC
## - HCT_min 1 1274.8 1320.8
## - Urine_min 1 1275.0 1321.0
## - BUN_max 1 1275.4 1321.4
## - BUN_diff 1 1275.4 1321.4
## - SOFA 1 1275.5 1321.5
## - Length_of_stay 1 1275.8 1321.8
## <none> 1274.8 1322.8
## - Platelets_max 1 1277.1 1323.1
## - Temp_min 1 1277.3 1323.3
## - Albumin_diff 1 1277.8 1323.8
## - Urine_diff 1 1277.9 1323.9
## - Lactate_min 1 1277.9 1323.9
## - Urine_max 1 1279.2 1325.2
## - Bilirubin_diff 1 1279.4 1325.4
## - Bilirubin_min 1 1282.2 1328.2
## - Creatinine_max 1 1283.4 1329.4
## - SAPS1 1 1283.7 1329.7
## - BUN_min 1 1284.6 1330.6
## - Na_min 1 1285.3 1331.3
## - Age 1 1291.2 1337.2
## - ICUType 3 1301.0 1343.0
## - GCS_max 1 1312.8 1358.8
##
## Step: AIC=1320.84
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Bilirubin_min + BUN_min + Lactate_min + Temp_min + Na_min +
## Urine_min + BUN_max + Creatinine_max + GCS_max + Platelets_max +
## Urine_max + Albumin_diff + Bilirubin_diff + BUN_diff + Urine_diff
##
## Df Deviance AIC
## - Urine_min 1 1275.1 1319.1
## - BUN_max 1 1275.4 1319.4
## - BUN_diff 1 1275.5 1319.5
## - SOFA 1 1275.6 1319.6
## - Length_of_stay 1 1275.8 1319.8
## <none> 1274.8 1320.8
## - Platelets_max 1 1277.3 1321.3
## - Temp_min 1 1277.4 1321.4
## - Albumin_diff 1 1277.8 1321.8
## - Urine_diff 1 1277.9 1321.9
## - Lactate_min 1 1278.0 1322.0
## - Urine_max 1 1279.2 1323.2
## - Bilirubin_diff 1 1279.4 1323.4
## - Bilirubin_min 1 1282.3 1326.3
## - Creatinine_max 1 1283.7 1327.7
## - SAPS1 1 1284.5 1328.5
## - BUN_min 1 1284.7 1328.7
## - Na_min 1 1285.3 1329.3
## - Age 1 1291.2 1335.2
## - ICUType 3 1301.1 1341.1
## - GCS_max 1 1313.0 1357.0
##
## Step: AIC=1319.09
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Bilirubin_min + BUN_min + Lactate_min + Temp_min + Na_min +
## BUN_max + Creatinine_max + GCS_max + Platelets_max + Urine_max +
## Albumin_diff + Bilirubin_diff + BUN_diff + Urine_diff
##
## Df Deviance AIC
## - BUN_max 1 1275.7 1317.7
## - BUN_diff 1 1275.8 1317.8
## - SOFA 1 1276.0 1318.0
## - Length_of_stay 1 1276.0 1318.0
## <none> 1275.1 1319.1
## - Platelets_max 1 1277.5 1319.5
## - Temp_min 1 1277.7 1319.7
## - Albumin_diff 1 1278.0 1320.0
## - Lactate_min 1 1278.2 1320.2
## - Urine_diff 1 1278.5 1320.5
## - Bilirubin_diff 1 1279.6 1321.6
## - Urine_max 1 1279.9 1321.9
## - Bilirubin_min 1 1282.4 1324.4
## - Creatinine_max 1 1284.0 1326.0
## - SAPS1 1 1284.9 1326.9
## - BUN_min 1 1285.0 1327.0
## - Na_min 1 1285.5 1327.5
## - Age 1 1291.9 1333.9
## - ICUType 3 1301.1 1339.1
## - GCS_max 1 1313.3 1355.3
##
## Step: AIC=1317.69
## in_hospital_death ~ Age + Length_of_stay + SOFA + SAPS1 + ICUType +
## Bilirubin_min + BUN_min + Lactate_min + Temp_min + Na_min +
## Creatinine_max + GCS_max + Platelets_max + Urine_max + Albumin_diff +
## Bilirubin_diff + BUN_diff + Urine_diff
##
## Df Deviance AIC
## - SOFA 1 1276.4 1316.4
## - Length_of_stay 1 1276.7 1316.7
## <none> 1275.7 1317.7
## - BUN_diff 1 1277.9 1317.9
## - Platelets_max 1 1278.1 1318.1
## - Temp_min 1 1278.4 1318.4
## - Albumin_diff 1 1278.7 1318.7
## - Lactate_min 1 1278.7 1318.7
## - Urine_diff 1 1279.2 1319.2
## - Bilirubin_diff 1 1280.3 1320.3
## - Urine_max 1 1280.6 1320.6
## - Bilirubin_min 1 1283.1 1323.1
## - SAPS1 1 1285.1 1325.1
## - Creatinine_max 1 1285.7 1325.7
## - Na_min 1 1286.8 1326.8
## - Age 1 1292.0 1332.0
## - ICUType 3 1301.1 1337.1
## - BUN_min 1 1298.9 1338.9
## - GCS_max 1 1315.2 1355.2
##
## Step: AIC=1316.39
## in_hospital_death ~ Age + Length_of_stay + SAPS1 + ICUType +
## Bilirubin_min + BUN_min + Lactate_min + Temp_min + Na_min +
## Creatinine_max + GCS_max + Platelets_max + Urine_max + Albumin_diff +
## Bilirubin_diff + BUN_diff + Urine_diff
##
## Df Deviance AIC
## - Length_of_stay 1 1277.3 1315.3
## <none> 1276.4 1316.4
## - BUN_diff 1 1278.6 1316.6
## - Temp_min 1 1279.1 1317.1
## - Platelets_max 1 1279.4 1317.4
## - Lactate_min 1 1279.4 1317.4
## - Albumin_diff 1 1279.5 1317.5
## - Urine_diff 1 1280.1 1318.1
## - Bilirubin_diff 1 1281.2 1319.2
## - Urine_max 1 1281.5 1319.5
## - Bilirubin_min 1 1284.3 1322.3
## - Creatinine_max 1 1286.0 1324.0
## - Na_min 1 1287.5 1325.5
## - Age 1 1292.0 1330.0
## - SAPS1 1 1292.3 1330.3
## - ICUType 3 1301.1 1335.1
## - BUN_min 1 1300.6 1338.6
## - GCS_max 1 1320.7 1358.7
##
## Step: AIC=1315.29
## in_hospital_death ~ Age + SAPS1 + ICUType + Bilirubin_min + BUN_min +
## Lactate_min + Temp_min + Na_min + Creatinine_max + GCS_max +
## Platelets_max + Urine_max + Albumin_diff + Bilirubin_diff +
## BUN_diff + Urine_diff
##
## Df Deviance AIC
## <none> 1277.3 1315.3
## - BUN_diff 1 1279.5 1315.5
## - Temp_min 1 1280.2 1316.2
## - Platelets_max 1 1280.3 1316.3
## - Lactate_min 1 1280.4 1316.4
## - Albumin_diff 1 1280.5 1316.5
## - Urine_diff 1 1280.7 1316.7
## - Bilirubin_diff 1 1282.1 1318.1
## - Urine_max 1 1282.2 1318.2
## - Bilirubin_min 1 1285.2 1321.2
## - Creatinine_max 1 1286.7 1322.7
## - Na_min 1 1288.2 1324.2
## - SAPS1 1 1292.7 1328.7
## - Age 1 1294.1 1330.1
## - ICUType 3 1302.1 1334.1
## - BUN_min 1 1301.1 1337.1
## - GCS_max 1 1321.1 1357.1
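The AIC column in the `step()` trace can be checked by hand. For a logistic model fitted to an ungrouped 0/1 outcome, the residual deviance equals minus twice the log-likelihood, so the AIC is the deviance plus twice the number of estimated coefficients. The sketch below verifies this on simulated data (`toy` is hypothetical) and then applies the same arithmetic to the final step of the trace.

```r
# Simulated binary outcome; not the assignment data.
set.seed(42)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- rbinom(n, size = 1, prob = plogis(0.5 * toy$x1))

fit <- glm(y ~ x1 + x2, data = toy, family = "binomial")

# AIC = residual deviance + 2 * number of estimated coefficients.
k <- length(coef(fit))
aic_by_hand <- deviance(fit) + 2 * k
print(c(by_hand = aic_by_hand, reported = AIC(fit)))

# Same arithmetic for the final step in the trace (values copied from the output):
# residual deviance 1277.3 with 19 estimated coefficients.
final_step_aic <- 1277.3 + 2 * 19
print(final_step_aic)
```

This also explains why dropping a one-degree-of-freedom term lowers the AIC only when the deviance rises by less than 2.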
summary(step_minmaxdiffICU_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Age + SAPS1 + ICUType + Bilirubin_min +
## BUN_min + Lactate_min + Temp_min + Na_min + Creatinine_max +
## GCS_max + Platelets_max + Urine_max + Albumin_diff + Bilirubin_diff +
## BUN_diff + Urine_diff, family = "binomial", data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1418 -0.5343 -0.3366 -0.1915 2.9968
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 8.7050888 3.5243615 2.470 0.013512 *
## Age 0.0212706 0.0052960 4.016 5.91e-05 ***
## SAPS1 0.0727961 0.0185586 3.923 8.76e-05 ***
## ICUTypeCardiac Surgery Recovery Unit -1.0525238 0.2963255 -3.552 0.000382 ***
## ICUTypeMedical ICU 0.0893952 0.2183601 0.409 0.682251
## ICUTypeSurgical ICU 0.0445516 0.2401282 0.186 0.852811
## Bilirubin_min 0.1833928 0.0673292 2.724 0.006453 **
## BUN_min 0.0361351 0.0076288 4.737 2.17e-06 ***
## Lactate_min 0.0978048 0.0552595 1.770 0.076741 .
## Temp_min -0.1315688 0.0774720 -1.698 0.089456 .
## Na_min -0.0470812 0.0140568 -3.349 0.000810 ***
## Creatinine_max -0.1997437 0.0695562 -2.872 0.004083 **
## GCS_max -0.1578605 0.0238632 -6.615 3.71e-11 ***
## Platelets_max -0.0011412 0.0006751 -1.690 0.090977 .
## Urine_max -0.0039957 0.0018065 -2.212 0.026982 *
## Albumin_diff 0.3415711 0.1902130 1.796 0.072538 .
## Bilirubin_diff -0.1448016 0.0684550 -2.115 0.034406 *
## BUN_diff -0.0117014 0.0078467 -1.491 0.135894
## Urine_diff 0.0035104 0.0018834 1.864 0.062339 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1627.0 on 1964 degrees of freedom
## Residual deviance: 1277.3 on 1946 degrees of freedom
## (96 observations deleted due to missingness)
## AIC: 1315.3
##
## Number of Fisher Scoring iterations: 6
# predictors retained after step():
# Age, SAPS1, ICUType
# Albumin_diff
# Bilirubin_min, Bilirubin_diff
# BUN_min, BUN_diff
# Creatinine_max
# GCS_max
# Lactate_min
# Na_min
# Platelets_max
# Temp_min
# Urine_max, Urine_diff
finalICU_glm <- glm(in_hospital_death ~
  # predictors retained by step() (not all are significant at p < 0.05)
  Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
  Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
  GCS_max + Lactate_min + Na_min + Platelets_max +
  Temp_min + Urine_max + Urine_diff +
  # clinically relevant predictors not included above
  # baseline demographics should be included even if not significant
  Gender + Length_of_stay + Weight_min +
  SOFA + # tests whether the SOFA score predicts mortality independently of its components
  # other clinically relevant measures
  Albumin_min + # low albumin indicates malnutrition or liver failure
  Glucose_max + # hyperglycaemia is a stress response
  HCT_min + # low HCT = anaemia
  HR_max + # tachycardia may indicate septic shock / inflammation
  PaO2_min + # hypoxia = inadequate organ perfusion/oxygenation
  PaCO2_min + # hypercapnia = respiratory / ventilation failure
  pH_min + # indicates acidaemia / inadequate organ perfusion
  RespRate_max + # indicates respiratory failure
  TroponinT_max + # indicates myocardial damage
  WBC_max # indicates infection
  , data = icu_patients_df1, family = "binomial")
summary(finalICU_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff +
## Bilirubin_min + Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
## GCS_max + Lactate_min + Na_min + Platelets_max + Temp_min +
## Urine_max + Urine_diff + Gender + Length_of_stay + Weight_min +
## SOFA + Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max,
## family = "binomial", data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1864 -0.5256 -0.3190 -0.1726 3.0839
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 11.3197869 6.3092397 1.794 0.07279 .
## Age 0.0261029 0.0059357 4.398 1.09e-05 ***
## SAPS1 0.0535276 0.0242680 2.206 0.02741 *
## ICUTypeCardiac Surgery Recovery Unit -0.9149479 0.3272024 -2.796 0.00517 **
## ICUTypeMedical ICU 0.1330375 0.2428895 0.548 0.58388
## ICUTypeSurgical ICU 0.3070023 0.2676527 1.147 0.25137
## Albumin_diff 0.2474274 0.2018893 1.226 0.22036
## Bilirubin_min 0.2069290 0.0698005 2.965 0.00303 **
## Bilirubin_diff -0.1706431 0.0707418 -2.412 0.01586 *
## BUN_min 0.0407697 0.0081139 5.025 5.04e-07 ***
## BUN_diff -0.0151092 0.0082071 -1.841 0.06562 .
## Creatinine_max -0.1787734 0.0720629 -2.481 0.01311 *
## GCS_max -0.1497749 0.0270334 -5.540 3.02e-08 ***
## Lactate_min 0.0564987 0.0599373 0.943 0.34587
## Na_min -0.0494656 0.0152171 -3.251 0.00115 **
## Platelets_max -0.0011105 0.0007713 -1.440 0.14992
## Temp_min -0.1117099 0.0828565 -1.348 0.17758
## Urine_max -0.0037421 0.0019103 -1.959 0.05012 .
## Urine_diff 0.0032488 0.0019892 1.633 0.10243
## GenderMale -0.0517044 0.1608977 -0.321 0.74795
## Length_of_stay -0.0076162 0.0061227 -1.244 0.21353
## Weight_min -0.0049457 0.0039722 -1.245 0.21310
## SOFA 0.0197980 0.0271001 0.731 0.46505
## Albumin_min -0.1680877 0.1292172 -1.301 0.19332
## Glucose_max 0.0002854 0.0007949 0.359 0.71958
## HCT_min -0.0082671 0.0150655 -0.549 0.58318
## HR_max 0.0077495 0.0034199 2.266 0.02345 *
## PaO2_min 0.0006981 0.0014372 0.486 0.62715
## PaCO2_min 0.0078284 0.0096972 0.807 0.41950
## pH_min -0.4944507 0.7266460 -0.680 0.49622
## RespRate_max 0.0114649 0.0103635 1.106 0.26861
## TroponinT_max 0.0534140 0.0343041 1.557 0.11945
## WBC_max -0.0090501 0.0105882 -0.855 0.39270
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1549.9 on 1854 degrees of freedom
## Residual deviance: 1182.0 on 1822 degrees of freedom
## (206 observations deleted due to missingness)
## AIC: 1248
##
## Number of Fisher Scoring iterations: 6
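The estimates in the table above are on the log-odds scale. Exponentiating them gives odds ratios, which are usually easier to report. The sketch below shows one way to build an odds-ratio table with Wald confidence intervals; it uses simulated data, and `toy`, `age` and `gcs` are hypothetical stand-ins rather than the assignment variables.

```r
# Simulated data; not the assignment data.
set.seed(7)
n <- 400
toy <- data.frame(age = rnorm(n, mean = 65, sd = 15),
                  gcs = sample(3:15, n, replace = TRUE))
toy$y <- rbinom(n, size = 1,
                prob = plogis(-1 + 0.02 * toy$age - 0.15 * toy$gcs))

fit <- glm(y ~ age + gcs, data = toy, family = "binomial")

# Odds ratios with Wald 95% confidence intervals.
# confint.default() uses estimate +/- 1.96 * standard error on the log-odds scale.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 3))

# The same transformation applied to one estimate copied from the summary above
# (GCS_max in finalICU_glm): odds ratio per one-point increase in GCS_max.
exp(-0.1497749)
```

An odds ratio below 1 means the odds of in-hospital death fall as the predictor increases, holding the other predictors fixed.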
Test some interaction terms based on clinical knowledge
# based on clinical knowledge, test some interaction terms
# add one interaction term to finalICU_glm per model,
# then compare each interaction model against the baseline finalICU_glm
# finalICU_glm_2 = finalICU_glm + Age:Creatinine_max
finalICU_glm_2 <- glm(in_hospital_death ~
# predictors retained by step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction term
Age:Creatinine_max # creatinine generally increases with age
, data=icu_patients_df1, family="binomial")
# finalICU_glm_3 = finalICU_glm + Age:Temp_min
finalICU_glm_3 <- glm(in_hospital_death ~
# predictors retained by step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction term
Age:Temp_min # low temp more often associated with illness in the elderly, e.g. cold sepsis
, data=icu_patients_df1, family="binomial")
# finalICU_glm_4 = finalICU_glm + Age:Weight_min
finalICU_glm_4 <- glm(in_hospital_death ~
# predictors retained by step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction term
Age:Weight_min # weight generally decreases with age
, data=icu_patients_df1, family="binomial")
# finalICU_glm_5 = finalICU_glm + Age:Albumin_min
finalICU_glm_5 <- glm(in_hospital_death ~
# predictors retained by step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction term
Age:Albumin_min # albumin can decrease with age
, data=icu_patients_df1, family="binomial")
# finalICU_glm_6 = finalICU_glm + Gender:HCT_min
finalICU_glm_6 <- glm(in_hospital_death ~
# predictors retained by step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction term
Gender:HCT_min # HCT can be lower in females than males
, data=icu_patients_df1, family="binomial")
# finalICU_glm_7 = finalICU_glm + PaO2_min:RespRate_max
finalICU_glm_7 <- glm(in_hospital_death ~
# predictors retained by step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction term
PaO2_min:RespRate_max # PaO2 and resp rate are intrinsically related physiologically
, data=icu_patients_df1, family="binomial")
# finalICU_glm_8 = finalICU_glm + Age:ICUType
finalICU_glm_8 <- glm(in_hospital_death ~
# predictors retained by step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction term
Age:ICUType # age is likely to be related to ICU type
# e.g. elderly more likely to have poor outcome after surgery requiring post-op ICU support
, data=icu_patients_df1, family="binomial")
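The interaction models above repeat the full baseline formula each time. A more compact pattern, shown here only as a sketch, is to add each term to the baseline fit with `update()` and collect the likelihood-ratio p-values in one vector. The data are simulated, and `toy`, `base_fit`, `interactions` and `lrt_p` are hypothetical names.

```r
# Simulated data; not the assignment data.
set.seed(3)
n <- 400
toy <- data.frame(age    = rnorm(n, mean = 65, sd = 15),
                  weight = rnorm(n, mean = 80, sd = 15),
                  hct    = rnorm(n, mean = 30, sd = 5))
toy$y <- rbinom(n, size = 1, prob = plogis(-3 + 0.03 * toy$age))

# Baseline model with main effects only.
base_fit <- glm(y ~ age + weight + hct, data = toy, family = "binomial")

# One candidate interaction per model, added on top of the baseline formula.
interactions <- c("age:weight", "age:hct")
int_fits <- lapply(interactions, function(term) {
  update(base_fit, as.formula(paste(". ~ . +", term)))
})
names(int_fits) <- interactions

# Likelihood-ratio (chi-squared) test of each interaction model against the baseline.
lrt_p <- sapply(int_fits, function(fit) {
  anova(base_fit, fit, test = "Chisq")[2, "Pr(>Chi)"]
})
print(lrt_p)
```

Because several interactions are tested, a p-value close to 0.05 is weak evidence on its own; adjusting with `p.adjust()` is one option.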
# comparing models with anova
lapply(list(finalICU_glm_2, finalICU_glm_3, finalICU_glm_4, finalICU_glm_5,
finalICU_glm_6, finalICU_glm_7, finalICU_glm_8),
function(x) {print(anova(finalICU_glm, x, test="Chisq"))} )
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## +Age:Creatinine_max
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182
## 2 1821 1181 1 0.99667 0.3181
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## +Age:Temp_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182
## 2 1821 1182 1 0.033838 0.8541
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## +Age:Weight_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1821 1178.2 1 3.8386 0.05009 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## Age:Albumin_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1821 1181.1 1 0.86974 0.351
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## Gender:HCT_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1821 1180.5 1 1.512 0.2188
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## PaO2_min:RespRate_max
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182
## 2 1821 1179 1 3.0502 0.08073 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## Age:ICUType
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1819 1172.3 3 9.6718 0.02157 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## [[1]]
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## +Age:Creatinine_max
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182
## 2 1821 1181 1 0.99667 0.3181
##
## [[2]]
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## +Age:Temp_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182
## 2 1821 1182 1 0.033838 0.8541
##
## [[3]]
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## +Age:Weight_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1821 1178.2 1 3.8386 0.05009 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## [[4]]
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## Age:Albumin_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1821 1181.1 1 0.86974 0.351
##
## [[5]]
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## Gender:HCT_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1821 1180.5 1 1.512 0.2188
##
## [[6]]
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## PaO2_min:RespRate_max
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182
## 2 1821 1179 1 3.0502 0.08073 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## [[7]]
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max
## Model 2: in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
## Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max + GCS_max +
## Lactate_min + Na_min + Platelets_max + Temp_min + Urine_max +
## Urine_diff + Gender + Length_of_stay + Weight_min + SOFA +
## Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## Age:ICUType
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 1822 1182.0
## 2 1819 1172.3 3 9.6718 0.02157 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
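# Age:Weight_min is borderline (p = 0.050): the effect of Weight_min may vary with age
# Age:ICUType is significant (p = 0.022): the effect of ICUType varies with age
# PaO2_min:RespRate_max is weaker (p = 0.081): the effect of RespRate_max may vary with PaO2_min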
## feel free to add other interactions to test
# input the significant interaction terms into a model and examine output
finalICU_glm_9 <- glm(in_hospital_death ~
# significant predictors from step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics
Gender + Length_of_stay + Weight_min + SOFA +
# other clinical relevance
Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
PaCO2_min + pH_min + RespRate_max + TroponinT_max +
WBC_max +
# interaction terms
Age:Weight_min + Age:ICUType # interactions that improved the model in the anova() likelihood-ratio tests above
, data=icu_patients_df1, family="binomial")
summary(finalICU_glm_9)
##
## Call:
## glm(formula = in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff +
## Bilirubin_min + Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
## GCS_max + Lactate_min + Na_min + Platelets_max + Temp_min +
## Urine_max + Urine_diff + Gender + Length_of_stay + Weight_min +
## SOFA + Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max +
## Age:Weight_min + Age:ICUType, family = "binomial", data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.2608 -0.5264 -0.3075 -0.1719 3.1116
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 10.4460822 6.6094070 1.580 0.113995
## Age 0.0489825 0.0227358 2.154 0.031207
## SAPS1 0.0535970 0.0244709 2.190 0.028507
## ICUTypeCardiac Surgery Recovery Unit -0.2210127 1.7105044 -0.129 0.897192
## ICUTypeMedical ICU -0.6989790 1.2630572 -0.553 0.579988
## ICUTypeSurgical ICU -2.7428109 1.3972637 -1.963 0.049648
## Albumin_diff 0.2472959 0.2044357 1.210 0.226413
## Bilirubin_min 0.2014480 0.0707616 2.847 0.004415
## Bilirubin_diff -0.1636288 0.0714581 -2.290 0.022030
## BUN_min 0.0419624 0.0081715 5.135 2.82e-07
## BUN_diff -0.0151647 0.0082462 -1.839 0.065915
## Creatinine_max -0.1903672 0.0731601 -2.602 0.009266
## GCS_max -0.1510045 0.0271798 -5.556 2.76e-08
## Lactate_min 0.0530645 0.0606197 0.875 0.381374
## Na_min -0.0507360 0.0153495 -3.305 0.000948
## Platelets_max -0.0011113 0.0007855 -1.415 0.157131
## Temp_min -0.1198013 0.0850819 -1.408 0.159110
## Urine_max -0.0037001 0.0019083 -1.939 0.052512
## Urine_diff 0.0032407 0.0019868 1.631 0.102864
## GenderMale 0.0190355 0.1646091 0.116 0.907938
## Length_of_stay -0.0070989 0.0060211 -1.179 0.238397
## Weight_min 0.0274055 0.0147932 1.853 0.063944
## SOFA 0.0213061 0.0275871 0.772 0.439924
## Albumin_min -0.1699025 0.1308453 -1.298 0.194116
## Glucose_max 0.0003778 0.0007995 0.473 0.636520
## HCT_min -0.0138659 0.0153380 -0.904 0.365982
## HR_max 0.0082142 0.0034443 2.385 0.017084
## PaO2_min 0.0005168 0.0014527 0.356 0.722037
## PaCO2_min 0.0096470 0.0097337 0.991 0.321643
## pH_min -0.4971454 0.7369185 -0.675 0.499912
## RespRate_max 0.0121383 0.0104359 1.163 0.244778
## TroponinT_max 0.0534584 0.0347094 1.540 0.123518
## WBC_max -0.0100620 0.0107421 -0.937 0.348919
## Age:Weight_min -0.0005043 0.0002219 -2.272 0.023059
## Age:ICUTypeCardiac Surgery Recovery Unit -0.0106114 0.0234023 -0.453 0.650238
## Age:ICUTypeMedical ICU 0.0105079 0.0168599 0.623 0.533123
## Age:ICUTypeSurgical ICU 0.0422493 0.0186848 2.261 0.023750
##
## (Intercept)
## Age *
## SAPS1 *
## ICUTypeCardiac Surgery Recovery Unit
## ICUTypeMedical ICU
## ICUTypeSurgical ICU *
## Albumin_diff
## Bilirubin_min **
## Bilirubin_diff *
## BUN_min ***
## BUN_diff .
## Creatinine_max **
## GCS_max ***
## Lactate_min
## Na_min ***
## Platelets_max
## Temp_min
## Urine_max .
## Urine_diff
## GenderMale
## Length_of_stay
## Weight_min .
## SOFA
## Albumin_min
## Glucose_max
## HCT_min
## HR_max *
## PaO2_min
## PaCO2_min
## pH_min
## RespRate_max
## TroponinT_max
## WBC_max
## Age:Weight_min *
## Age:ICUTypeCardiac Surgery Recovery Unit
## Age:ICUTypeMedical ICU
## Age:ICUTypeSurgical ICU *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1549.9 on 1854 degrees of freedom
## Residual deviance: 1167.4 on 1818 degrees of freedom
## (206 observations deleted due to missingness)
## AIC: 1241.4
##
## Number of Fisher Scoring iterations: 6
# the AIC is slightly lower than finalICU_glm
## the interaction terms are statistically significant, but their ORs are close to 1 with narrow CIs; these are per-unit ORs (e.g. per year of age), so a value near 1 can still add up to a sizeable effect across the observed age range
options(scipen=999)
exp(coef(finalICU_glm_9))
## (Intercept)
## 34409.3024625
## Age
## 1.0502020
## SAPS1
## 1.0550593
## ICUTypeCardiac Surgery Recovery Unit
## 0.8017065
## ICUTypeMedical ICU
## 0.4970926
## ICUTypeSurgical ICU
## 0.0643891
## Albumin_diff
## 1.2805580
## Bilirubin_min
## 1.2231726
## Bilirubin_diff
## 0.8490571
## BUN_min
## 1.0428553
## BUN_diff
## 0.9849497
## Creatinine_max
## 0.8266555
## GCS_max
## 0.8598438
## Lactate_min
## 1.0544977
## Na_min
## 0.9505295
## Platelets_max
## 0.9988893
## Temp_min
## 0.8870967
## Urine_max
## 0.9963068
## Urine_diff
## 1.0032460
## GenderMale
## 1.0192178
## Length_of_stay
## 0.9929262
## Weight_min
## 1.0277845
## SOFA
## 1.0215347
## Albumin_min
## 0.8437471
## Glucose_max
## 1.0003779
## HCT_min
## 0.9862298
## HR_max
## 1.0082480
## PaO2_min
## 1.0005169
## PaCO2_min
## 1.0096936
## pH_min
## 0.6082645
## RespRate_max
## 1.0122123
## TroponinT_max
## 1.0549131
## WBC_max
## 0.9899884
## Age:Weight_min
## 0.9994958
## Age:ICUTypeCardiac Surgery Recovery Unit
## 0.9894447
## Age:ICUTypeMedical ICU
## 1.0105633
## Age:ICUTypeSurgical ICU
## 1.0431545
exp(confint(finalICU_glm_9))
## Waiting for profiling to be done...
## 2.5 % 97.5 %
## (Intercept) 0.349264961 55616396939.1895981
## Age 1.004363391 1.0982223
## SAPS1 1.005646548 1.1069781
## ICUTypeCardiac Surgery Recovery Unit 0.026475610 22.7506774
## ICUTypeMedical ICU 0.045628604 6.6130489
## ICUTypeSurgical ICU 0.004335545 1.0705690
## Albumin_diff 0.855208579 1.9074646
## Bilirubin_min 1.068091445 1.4094166
## Bilirubin_diff 0.735184142 0.9726065
## BUN_min 1.026518641 1.0599608
## BUN_diff 0.968983906 1.0008796
## Creatinine_max 0.710256544 0.9464742
## GCS_max 0.814926434 0.9066580
## Lactate_min 0.936460628 1.1861611
## Na_min 0.922266474 0.9795813
## Platelets_max 0.997318447 1.0003969
## Temp_min 0.749498343 1.0457701
## Urine_max 0.992581920 1.0000434
## Urine_diff 0.999339162 1.0071610
## GenderMale 0.738576106 1.4089332
## Length_of_stay 0.980684397 1.0041664
## Weight_min 0.997344469 1.0570192
## SOFA 0.967768306 1.0784256
## Albumin_min 0.651585452 1.0891345
## Glucose_max 0.998766089 1.0019205
## HCT_min 0.956821880 1.0161779
## HR_max 1.001440586 1.0150877
## PaO2_min 0.997618803 1.0033342
## PaCO2_min 0.990309329 1.0288868
## pH_min 0.109787108 1.7759592
## RespRate_max 0.991427049 1.0328887
## TroponinT_max 0.983467416 1.1282291
## WBC_max 0.968562232 1.0099461
## Age:Weight_min 0.999069973 0.9999408
## Age:ICUTypeCardiac Surgery Recovery Unit 0.945156830 1.0364600
## Age:ICUTypeMedical ICU 0.976654945 1.0436592
## Age:ICUTypeSurgical ICU 1.005000871 1.0817098
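Because the interaction ORs above are per year of age, their size is easier to judge at specific ages. A minimal sketch, using the Surgical ICU coefficients printed in the summary above; the ages 40 and 80 are arbitrary illustrative choices, and the result is the OR for Surgical ICU versus the reference ICU type at that age, with the other covariates held fixed:

```r
# Coefficients copied from summary(finalICU_glm_9) above
b_icu_surgical     <- -2.7428109  # ICUTypeSurgical ICU (main effect)
b_age_icu_surgical <-  0.0422493  # Age:ICUTypeSurgical ICU (interaction)

# OR for Surgical ICU vs the reference ICU type at a given age
or_surgical_at_age <- function(age) {
  exp(b_icu_surgical + b_age_icu_surgical * age)
}

ages <- c(40, 80)  # illustrative ages, not taken from the analysis
or_by_age <- or_surgical_at_age(ages)
round(setNames(or_by_age, paste0("age_", ages)), 2)
```

The OR moves from below 1 at the younger age to above 1 at the older age, so an interaction OR of about 1.04 per year is not negligible over a wide age range.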
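Testing a modified Poisson regression, because the outcome is common in these data (14% in-hospital mortality, above the usual 10% threshold), in which case odds ratios lie further from 1 than the corresponding risk ratios.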
# test using modified poisson regression for more common outcomes on the same covariates as above
finalICU_glm_poisson <- glm(in_hospital_death ~
# significant predictors from step()
Age + SAPS1 + ICUType + Albumin_diff + Bilirubin_min +
Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
GCS_max + Lactate_min + Na_min + Platelets_max +
Temp_min + Urine_max + Urine_diff +
# baseline demographics should be included even if not significant
Gender + Length_of_stay + Weight_min +
SOFA + # an indicator of how well SOFA score determines mortality independent to its components
# other clinical relevance
Albumin_min + # low albumin indicates malnutrition or liver failure
Glucose_max + # hyperglycaemia is a stress response
HCT_min + # low HCT = anaemia
HR_max + # tachycardia may indicate septic shock / inflammation
PaO2_min + # hypoxia = inadequate organ perfusion/oxygenation
PaCO2_min + #hypercapnia = respiratory / ventilation failure
pH_min + # indicates acidaemia / inadequate organ perfusion
RespRate_max + # indicates respiratory failure
TroponinT_max + # indicates myocardial damage
WBC_max # indicates infection
, data=icu_patients_df1, family=poisson(link="log"))
summary(finalICU_glm_poisson)
##
## Call:
## glm(formula = in_hospital_death ~ Age + SAPS1 + ICUType + Albumin_diff +
## Bilirubin_min + Bilirubin_diff + BUN_min + BUN_diff + Creatinine_max +
## GCS_max + Lactate_min + Na_min + Platelets_max + Temp_min +
## Urine_max + Urine_diff + Gender + Length_of_stay + Weight_min +
## SOFA + Albumin_min + Glucose_max + HCT_min + HR_max + PaO2_min +
## PaCO2_min + pH_min + RespRate_max + TroponinT_max + WBC_max,
## family = poisson(link = "log"), data = icu_patients_df1)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8744 -0.5069 -0.3491 -0.2183 2.4834
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.0214843 3.3904946 1.481 0.13859
## Age 0.0198792 0.0049400 4.024 0.00005718
## SAPS1 0.0339633 0.0204016 1.665 0.09596
## ICUTypeCardiac Surgery Recovery Unit -0.7603505 0.2813563 -2.702 0.00688
## ICUTypeMedical ICU 0.0629493 0.1935242 0.325 0.74497
## ICUTypeSurgical ICU 0.1931383 0.2185759 0.884 0.37690
## Albumin_diff 0.1457395 0.1641374 0.888 0.37459
## Bilirubin_min 0.1464095 0.0568473 2.575 0.01001
## Bilirubin_diff -0.1263382 0.0579171 -2.181 0.02916
## BUN_min 0.0292204 0.0068460 4.268 0.00001970
## BUN_diff -0.0157550 0.0068983 -2.284 0.02238
## Creatinine_max -0.0890083 0.0548536 -1.623 0.10466
## GCS_max -0.0990101 0.0213334 -4.641 0.00000347
## Lactate_min -0.0110473 0.0333359 -0.331 0.74035
## Na_min -0.0352725 0.0121226 -2.910 0.00362
## Platelets_max -0.0006707 0.0006484 -1.034 0.30096
## Temp_min -0.0647282 0.0531798 -1.217 0.22354
## Urine_max -0.0015792 0.0014410 -1.096 0.27312
## Urine_diff 0.0011907 0.0015125 0.787 0.43113
## GenderMale -0.0834178 0.1320252 -0.632 0.52750
## Length_of_stay -0.0042229 0.0048572 -0.869 0.38462
## Weight_min -0.0037488 0.0033072 -1.134 0.25700
## SOFA 0.0144047 0.0215427 0.669 0.50371
## Albumin_min -0.1336429 0.1048927 -1.274 0.20263
## Glucose_max 0.0001538 0.0006267 0.245 0.80608
## HCT_min -0.0108434 0.0126736 -0.856 0.39222
## HR_max 0.0052571 0.0026488 1.985 0.04718
## PaO2_min 0.0004131 0.0011102 0.372 0.70981
## PaCO2_min 0.0059266 0.0079216 0.748 0.45437
## pH_min -0.0847659 0.2631944 -0.322 0.74740
## RespRate_max 0.0095491 0.0085073 1.122 0.26167
## TroponinT_max 0.0334174 0.0247510 1.350 0.17697
## WBC_max -0.0071252 0.0086893 -0.820 0.41222
##
## (Intercept)
## Age ***
## SAPS1 .
## ICUTypeCardiac Surgery Recovery Unit **
## ICUTypeMedical ICU
## ICUTypeSurgical ICU
## Albumin_diff
## Bilirubin_min *
## Bilirubin_diff *
## BUN_min ***
## BUN_diff *
## Creatinine_max
## GCS_max ***
## Lactate_min
## Na_min **
## Platelets_max
## Temp_min
## Urine_max
## Urine_diff
## GenderMale
## Length_of_stay
## Weight_min
## SOFA
## Albumin_min
## Glucose_max
## HCT_min
## HR_max *
## PaO2_min
## PaCO2_min
## pH_min
## RespRate_max
## TroponinT_max
## WBC_max
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 1046.23 on 1854 degrees of freedom
## Residual deviance: 769.47 on 1822 degrees of freedom
## (206 observations deleted due to missingness)
## AIC: 1381.5
##
## Number of Fisher Scoring iterations: 6
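# fewer significant variables (likely because model-based Poisson standard errors are too wide for a binary outcome; modified Poisson regression normally pairs this model with robust (sandwich) standard errors)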
# but the variables that are significant were also significant in the logistic model
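The "modified" part of modified Poisson regression is the robust (sandwich) variance estimate. A minimal base-R sketch of the HC0 sandwich for a log-link Poisson fit follows; it uses simulated data, and the names `x`, `y` and `fit` are hypothetical stand-ins rather than the ICU variables. The `sandwich` package's `vcovHC()` offers the same kind of estimate if it is available.

```r
# Simulated binary outcome with a log-linear risk model (illustration only)
set.seed(1)
n <- 2000
x <- rnorm(n)
risk <- pmin(exp(-2 + 0.3 * x), 1)   # true risk, capped at 1
y <- rbinom(n, 1, risk)

# Poisson GLM with log link fitted to the 0/1 outcome
fit <- glm(y ~ x, family = poisson(link = "log"))

# HC0 sandwich variance: bread %*% meat %*% bread
X <- model.matrix(fit)
mu <- fitted(fit)
bread <- solve(crossprod(X, X * mu))   # (X' W X)^-1 with W = diag(mu)
meat <- crossprod(X * (y - mu))        # X' diag((y - mu)^2) X
robust_vcov <- bread %*% meat %*% bread

robust_se <- sqrt(diag(robust_vcov))
model_se <- sqrt(diag(vcov(fit)))

# Risk ratios with model-based and robust standard errors side by side
cbind(RR = exp(coef(fit)), model_se = model_se, robust_se = robust_se)
```

Confidence intervals and p-values for the risk ratios would then be based on `robust_se` rather than on the standard errors printed by `summary()`.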
# examine ORs from logistic regression
options(scipen=999) # turn off scientific notation
exp(coef(finalICU_glm))
## (Intercept) Age
## 82436.7716848 1.0264465
## SAPS1 ICUTypeCardiac Surgery Recovery Unit
## 1.0549861 0.4005375
## ICUTypeMedical ICU ICUTypeSurgical ICU
## 1.1422928 1.3593441
## Albumin_diff Bilirubin_min
## 1.2807264 1.2298952
## Bilirubin_diff BUN_min
## 0.8431225 1.0416122
## BUN_diff Creatinine_max
## 0.9850044 0.8362954
## GCS_max Lactate_min
## 0.8609018 1.0581252
## Na_min Platelets_max
## 0.9517379 0.9988901
## Temp_min Urine_max
## 0.8943036 0.9962649
## Urine_diff GenderMale
## 1.0032541 0.9496095
## Length_of_stay Weight_min
## 0.9924128 0.9950665
## SOFA Albumin_min
## 1.0199953 0.8452797
## Glucose_max HCT_min
## 1.0002854 0.9917670
## HR_max PaO2_min
## 1.0077796 1.0006984
## PaCO2_min pH_min
## 1.0078591 0.6099058
## RespRate_max TroponinT_max
## 1.0115309 1.0548663
## WBC_max
## 0.9909907
# examine RRs from Poisson regression
exp(coef(finalICU_glm_poisson))
## (Intercept) Age
## 151.6362129 1.0200782
## SAPS1 ICUTypeCardiac Surgery Recovery Unit
## 1.0345467 0.4675026
## ICUTypeMedical ICU ICUTypeSurgical ICU
## 1.0649728 1.2130505
## Albumin_diff Bilirubin_min
## 1.1568947 1.1576701
## Bilirubin_diff BUN_min
## 0.8813167 1.0296515
## BUN_diff Creatinine_max
## 0.9843684 0.9148380
## GCS_max Lactate_min
## 0.9057335 0.9890134
## Na_min Platelets_max
## 0.9653423 0.9993295
## Temp_min Urine_max
## 0.9373222 0.9984220
## Urine_diff GenderMale
## 1.0011914 0.9199667
## Length_of_stay Weight_min
## 0.9957860 0.9962582
## SOFA Albumin_min
## 1.0145089 0.8749024
## Glucose_max HCT_min
## 1.0001539 0.9892151
## HR_max PaO2_min
## 1.0052709 1.0004132
## PaCO2_min pH_min
## 1.0059442 0.9187274
## RespRate_max TroponinT_max
## 1.0095948 1.0339820
## WBC_max
## 0.9929001
# the ORs and RRs appear very similar --> check the actual differences
exp(coef(finalICU_glm))-exp(coef(finalICU_glm_poisson))
## (Intercept) Age
## 82285.1354719059 0.0063683588
## SAPS1 ICUTypeCardiac Surgery Recovery Unit
## 0.0204394406 -0.0669650446
## ICUTypeMedical ICU ICUTypeSurgical ICU
## 0.0773199959 0.1462936040
## Albumin_diff Bilirubin_min
## 0.1238316698 0.0722251056
## Bilirubin_diff BUN_min
## -0.0381942636 0.0119607014
## BUN_diff Creatinine_max
## 0.0006359748 -0.0785426101
## GCS_max Lactate_min
## -0.0448317494 0.0691117953
## Na_min Platelets_max
## -0.0136044129 -0.0004393842
## Temp_min Urine_max
## -0.0430185399 -0.0021571705
## Urine_diff GenderMale
## 0.0020626400 0.0296427916
## Length_of_stay Weight_min
## -0.0033731879 -0.0011917417
## SOFA Albumin_min
## 0.0054863813 -0.0296227050
## Glucose_max HCT_min
## 0.0001315695 0.0025518576
## HR_max PaO2_min
## 0.0025087077 0.0002851903
## PaCO2_min pH_min
## 0.0019148808 -0.3088215213
## RespRate_max TroponinT_max
## 0.0019360867 0.0208842520
## WBC_max
## -0.0019094248
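# the intercepts differ hugely (by about 82,000), but an exponentiated intercept is the odds (or risk) when every covariate equals 0 (e.g. Age 0, pH_min 0), far outside the observed data, so it is not interpretable here
# the other estimates are similar, with pH_min the main exception (OR 0.61 vs RR 0.92)
# the logistic model therefore seems reasonable, provided results are interpreted as 'odds' rather than 'risk'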
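One rough way to see how far an OR can drift from an RR when the outcome is common is the approximation RR ≈ OR / (1 - p0 + p0 × OR), where p0 is the risk in the reference group. In the sketch below, p0 = 0.14 is an assumption (the overall mortality quoted above, not a reference-group risk), and the approximation is known to be crude for adjusted ORs, so the result is indicative only:

```r
# Approximate conversion from an odds ratio to a risk ratio
or_to_rr <- function(or, p0) {
  or / (1 - p0 + p0 * or)
}

p0 <- 0.14               # assumption: overall in-hospital mortality as baseline risk
or_cardiac <- 0.4005375  # ICUTypeCardiac Surgery Recovery Unit, from exp(coef(finalICU_glm))

rr_approx <- or_to_rr(or_cardiac, p0)
rr_approx
```

The converted value sits closer to 1 than the OR, in the same direction as the Poisson estimate for this term shown above.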
library(magrittr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# lots of missing data in:
# Survival, Height, and the invasive (ABP) and non-invasive (NI) blood pressure variables
# some missing data in SAPS1 and Weight - consistent with the observations deleted due to missingness in the GLM output
for(i in 1:length(colnames(icu_patients_df1))){
print(c(i,colnames(icu_patients_df1[i]), sum(is.na(icu_patients_df1[i]))))
}
## [1] "1" "RecordID" "0"
## [1] "2" "Length_of_stay" "0"
## [1] "3" "SAPS1" "96"
## [1] "4" "SOFA" "0"
## [1] "5" "Survival" "1288"
## [1] "6" "in_hospital_death" "0"
## [1] "7" "Days" "0"
## [1] "8" "Status" "0"
## [1] "9" "Age" "0"
## [1] "10" "Albumin_diff" "0"
## [1] "11" "Albumin_max" "0"
## [1] "12" "Albumin_min" "0"
## [1] "13" "ALP_diff" "0"
## [1] "14" "ALP_max" "0"
## [1] "15" "ALP_min" "0"
## [1] "16" "ALT_diff" "0"
## [1] "17" "ALT_max" "0"
## [1] "18" "ALT_min" "0"
## [1] "19" "AST_diff" "0"
## [1] "20" "AST_max" "0"
## [1] "21" "AST_min" "0"
## [1] "22" "Bilirubin_diff" "0"
## [1] "23" "Bilirubin_max" "0"
## [1] "24" "Bilirubin_min" "0"
## [1] "25" "BUN_diff" "0"
## [1] "26" "BUN_max" "0"
## [1] "27" "BUN_min" "0"
## [1] "28" "Cholesterol_diff" "0"
## [1] "29" "Cholesterol_max" "0"
## [1] "30" "Cholesterol_min" "0"
## [1] "31" "Creatinine_diff" "0"
## [1] "32" "Creatinine_max" "0"
## [1] "33" "Creatinine_min" "0"
## [1] "34" "DiasABP_diff" "715"
## [1] "35" "DiasABP_max" "715"
## [1] "36" "DiasABP_min" "715"
## [1] "37" "FiO2_diff" "0"
## [1] "38" "FiO2_max" "0"
## [1] "39" "FiO2_min" "0"
## [1] "40" "GCS_diff" "0"
## [1] "41" "GCS_max" "0"
## [1] "42" "GCS_min" "0"
## [1] "43" "Gender" "0"
## [1] "44" "Glucose_diff" "0"
## [1] "45" "Glucose_max" "0"
## [1] "46" "Glucose_min" "0"
## [1] "47" "HCO3_diff" "0"
## [1] "48" "HCO3_max" "0"
## [1] "49" "HCO3_min" "0"
## [1] "50" "HCT_diff" "0"
## [1] "51" "HCT_max" "0"
## [1] "52" "HCT_min" "0"
## [1] "53" "Height" "992"
## [1] "54" "HR_diff" "0"
## [1] "55" "HR_max" "0"
## [1] "56" "HR_min" "0"
## [1] "57" "ICUType" "0"
## [1] "58" "K_diff" "0"
## [1] "59" "K_max" "0"
## [1] "60" "K_min" "0"
## [1] "61" "Lactate_diff" "0"
## [1] "62" "Lactate_max" "0"
## [1] "63" "Lactate_min" "0"
## [1] "64" "MAP_diff" "0"
## [1] "65" "MAP_max" "0"
## [1] "66" "MAP_min" "0"
## [1] "67" "Mg_diff" "0"
## [1] "68" "Mg_max" "0"
## [1] "69" "Mg_min" "0"
## [1] "70" "Na_diff" "0"
## [1] "71" "Na_max" "0"
## [1] "72" "Na_min" "0"
## [1] "73" "NIDiasABP_diff" "455"
## [1] "74" "NIDiasABP_max" "455"
## [1] "75" "NIDiasABP_min" "455"
## [1] "76" "NIMAP_diff" "455"
## [1] "77" "NIMAP_max" "455"
## [1] "78" "NIMAP_min" "455"
## [1] "79" "NISysABP_diff" "453"
## [1] "80" "NISysABP_max" "453"
## [1] "81" "NISysABP_min" "453"
## [1] "82" "PaCO2_diff" "0"
## [1] "83" "PaCO2_max" "0"
## [1] "84" "PaCO2_min" "0"
## [1] "85" "PaO2_diff" "0"
## [1] "86" "PaO2_max" "0"
## [1] "87" "PaO2_min" "0"
## [1] "88" "pH_diff" "0"
## [1] "89" "pH_max" "0"
## [1] "90" "pH_min" "0"
## [1] "91" "Platelets_diff" "0"
## [1] "92" "Platelets_max" "0"
## [1] "93" "Platelets_min" "0"
## [1] "94" "RespRate_diff" "0"
## [1] "95" "RespRate_max" "0"
## [1] "96" "RespRate_min" "0"
## [1] "97" "SaO2_diff" "0"
## [1] "98" "SaO2_max" "0"
## [1] "99" "SaO2_min" "0"
## [1] "100" "SysABP_diff" "715"
## [1] "101" "SysABP_max" "715"
## [1] "102" "SysABP_min" "715"
## [1] "103" "Temp_diff" "0"
## [1] "104" "Temp_max" "0"
## [1] "105" "Temp_min" "0"
## [1] "106" "TroponinI_diff" "0"
## [1] "107" "TroponinI_max" "0"
## [1] "108" "TroponinI_min" "0"
## [1] "109" "TroponinT_diff" "0"
## [1] "110" "TroponinT_max" "0"
## [1] "111" "TroponinT_min" "0"
## [1] "112" "Urine_diff" "0"
## [1] "113" "Urine_max" "0"
## [1] "114" "Urine_min" "0"
## [1] "115" "WBC_diff" "0"
## [1] "116" "WBC_max" "0"
## [1] "117" "WBC_min" "0"
## [1] "118" "Weight_diff" "146"
## [1] "119" "Weight_max" "146"
## [1] "120" "Weight_min" "146"
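The same per-column count of missing values can be obtained without a loop. A small sketch on a toy data frame (the data frame `toy` and its columns are illustrative only):

```r
# Toy data frame with some missing values
toy <- data.frame(a = c(1, NA, 3),
                  b = c("x", "y", NA),
                  c = 1:3)

# Number of NAs per column, as a named vector
na_counts <- colSums(is.na(toy))

# Keep only the columns that have missing values
na_counts[na_counts > 0]
```

Applied to `icu_patients_df1`, `colSums(is.na(icu_patients_df1))` would give the counts listed above as a single named vector.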
# remove observations with missing values from the data frame,
# because they are automatically dropped by glm()
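# first drop the Survival, Height, invasive ABP (DiasABP, SysABP) and non-invasive BP (NIDiasABP, NIMAP, NISysABP) columns,
# so that na.omit() does not discard rows because of variables the model does not use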
icu_patients_df1_nm <- icu_patients_df1[, -c(5,34:36,53,73:81, 100:102)]
icu_patients_df1_nm <- na.omit(icu_patients_df1_nm)
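Dropping columns by position breaks silently if the column order ever changes. A sketch of the same idea using column names, on a toy data frame (`toy` and `drop_cols` are hypothetical names for illustration):

```r
# Toy data frame standing in for the ICU data
toy <- data.frame(RecordID = 1:4,
                  Survival = c(NA, 10, NA, 3),
                  Height   = c(170, NA, NA, 165),
                  Age      = c(70, 55, NA, 81))

# Columns to exclude before removing incomplete rows
drop_cols <- c("Survival", "Height")

# Drop by name, then keep only complete rows
toy_nm <- na.omit(toy[, !(names(toy) %in% drop_cols)])
toy_nm
```

For the ICU data, the index vector used above corresponds to Survival, Height and the DiasABP, SysABP, NIDiasABP, NIMAP and NISysABP summary columns.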
### Goodness of fit using bins ###
# add predicted probabilities to the data frame
icu_patients_df1_nm %>% mutate(predprob=predict(finalICU_glm, type="response"),
linpred=predict(finalICU_glm)) %>%
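# group the data into bins based on the linear predictor fitted values
# (note: cut() returns NA for values outside the outermost breaks, so both tails are grouped into a single NA bin)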
group_by(cut(linpred, breaks=unique(quantile(linpred, (1:50)/51)))) %>%
# summarise by bin
summarise(death_bin=sum(in_hospital_death), predprob_bin=mean(predprob), n_bin=n()) %>%
# add the standard error of the mean predicted probability for each bin
mutate(se_predprob_bin=sqrt(predprob_bin*(1 - predprob_bin)/n_bin)) %>%
# plot it with 95% confidence interval bars
ggplot(aes(x=predprob_bin,
y=death_bin/n_bin,
ymin=death_bin/n_bin - 1.96*se_predprob_bin,
ymax=death_bin/n_bin + 1.96*se_predprob_bin)) +
geom_point() + geom_linerange(colour="orange", alpha=0.4) +
geom_abline(intercept=0, slope=1) +
labs(x="Predicted probability (binned)",
y="Observed proportion (in each bin)")
# the ideal calibration line (intercept 0, slope 1) passes within the 95% CIs of most bins
### Goodness of fit using Hosmer Lemeshow stat ###
icu_patients_df1_nm %>% mutate(predprob=predict(finalICU_glm, type="response"),
linpred=predict(finalICU_glm)) %>%
group_by(cut(linpred, breaks=unique(quantile(linpred, (1:50)/51)))) %>%
summarise(death_bin=sum(in_hospital_death), predprob_bin=mean(predprob), n_bin=n()) %>%
mutate(se_predprob_bin=sqrt(predprob_bin*(1 - predprob_bin)/n_bin)) -> hl_df
hl_stat <- with(hl_df, sum( (death_bin - n_bin*predprob_bin)^2 /
(n_bin* predprob_bin*(1 - predprob_bin))))
hl <- c(hosmer_lemeshow_stat=hl_stat, hl_degrees_freedom=nrow(hl_df) - 1)
hl
## hosmer_lemeshow_stat hl_degrees_freedom
## 48.22866 49.00000
# calculate p-value
c(p_val=1 - pchisq(hl[1], hl[2])) # the p-value is not statistically significant, so there is no evidence of lack of fit
## p_val.hosmer_lemeshow_stat
## 0.5043216
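For comparison, the conventional form of the Hosmer-Lemeshow test uses g groups that cover every observation and g - 2 degrees of freedom. This differs from the calculation above, which used 50 bins and nrow(hl_df) - 1 degrees of freedom. A minimal sketch on simulated data; the function name `hosmer_lemeshow` and the variables `x`, `y` and `p` are hypothetical:

```r
# Simulated logistic data (illustration only)
set.seed(42)
n <- 1500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-2 + x))
fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)

hosmer_lemeshow <- function(y, p, g = 10) {
  # Breaks run from the minimum to the maximum, so no observation falls outside
  breaks <- unique(quantile(p, probs = seq(0, 1, length.out = g + 1)))
  bin <- cut(p, breaks = breaks, include.lowest = TRUE)
  obs <- tapply(y, bin, sum)        # observed events per bin
  expd <- tapply(p, bin, sum)       # expected events per bin
  n_bin <- tapply(p, bin, length)   # bin sizes
  stat <- sum((obs - expd)^2 / (expd * (1 - expd / n_bin)))
  df <- nlevels(bin) - 2
  c(statistic = stat, df = df,
    p_value = pchisq(stat, df, lower.tail = FALSE))
}

hl_result <- hosmer_lemeshow(y, p)
hl_result
```

The choice of g affects the result, so it is worth reporting the number of groups alongside the p-value.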
### Brier score ###
get_brier <- function(model){
predprob <- predict(model, type="response")
Brier_score <- mean((predprob - icu_patients_df1_nm$in_hospital_death)^2)
return(Brier_score)
}
get_brier(finalICU_glm)
## [1] 0.09699079
get_brier(minmaxdiffICU_glm)
## Warning in predprob - icu_patients_df1_nm$in_hospital_death: longer object
## length is not a multiple of shorter object length
## [1] 0.1540638
get_brier(step_minmaxdiffICU_glm)
## Warning in predprob - icu_patients_df1_nm$in_hospital_death: longer object
## length is not a multiple of shorter object length
## [1] 0.1538637
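# finalICU_glm has the lowest Brier score (lower is better), but the scores for the other two models were computed with mismatched vector lengths (see the warnings above), so they are not reliable for comparison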
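The warnings above arise because `get_brier()` always compares predictions with the outcome column of `icu_patients_df1_nm`, whereas each model may have been fitted to a different set of rows. One way to avoid this, sketched here on simulated data, is to take the outcome from the fitted object itself (`model$y`, which `glm()` stores by default). The data frame `d` and the models `m1` and `m2` are hypothetical:

```r
# Simulated data in which one predictor has missing values
set.seed(7)
d <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
d$y <- rbinom(300, 1, plogis(-1 + d$x1))
d$x2[1:40] <- NA   # m2 will drop these 40 rows when it is fitted

get_brier <- function(model) {
  # model$y holds the outcome for exactly the rows used in the fit,
  # so it always lines up with fitted(model)
  mean((fitted(model) - model$y)^2)
}

m1 <- glm(y ~ x1, data = d, family = binomial)
m2 <- glm(y ~ x1 + x2, data = d, family = binomial)

brier_scores <- c(m1 = get_brier(m1), m2 = get_brier(m2))
brier_scores
```

Scores computed this way are only directly comparable when the models are fitted to the same observations, so refitting all candidates on a common complete-case data frame is the safer comparison.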
Create your response to this task here, as a mixture of embedded (knitr) R code and any resulting outputs, and explanatory or commentary text.
In this task, you are required to develop a Cox proportional hazards survival model using the icu_patients_df1 data set which adequately explains or predicts the length of survival indicated by the Days variable, with censoring as indicated by the Status variable. You should fit a series of models, maybe three or four, evaluating each one, before you present your final model. Your final model should not include all the predictor variables, just a small subset of them, which you have selected based on statistical significance and/or background knowledge. Aim for between five and ten predictor variables (slightly more or fewer is OK). It is perfectly acceptable to include predictor variables in your final model which are not statistically significant, as long as you justify their inclusion on medical or physiological grounds (you will not be marked down if your medical justification is not exactly correct, but do your best). You should assess each model you consider for goodness of fit and other relevant statistics, and you should assess your final model for violations of assumptions and perform other diagnostics which you think are relevant (and modify the model if indicated, or at least comment on the possible impact of what your diagnostics show). Finally, re-fit your final model to the unimputed data frame (icu_patients_df0.rds) and comment on any differences you find.
Select an initial subset of explanatory variables that you will use to predict survival. Justify your choice.
Conduct basic exploratory data analysis on your variables of choice.
Fit appropriate univariate Cox proportional hazards models.
Fit an appropriate series of multivariable Cox proportional hazards models, justifying your approach. Assess each model you consider for goodness of fit and other relevant statistics.
Present your final model. Your final model should not include all the predictor variables, just a small subset of them, which you have selected based on statistical significance and/or background knowledge.
For your final model, present a set of diagnostic statistics and/or charts and comment on them.
Write a very brief paragraph summarising the most important findings of your final model. Include the most important values from the statistical output, and a simple clinical interpretation.
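The steps listed above can be sketched in outline as follows. This is only an illustration under stated assumptions: it uses the `lung` data set shipped with the `survival` package as a stand-in, so the variables `time`, `status`, `age`, `sex` and `ph.ecog` are not from the ICU data, and the choice of predictors is arbitrary.

```r
library(survival)

# Stand-in data (assumption): complete cases from the built-in lung data set.
dat <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])

# In lung, status is coded 1 = censored, 2 = dead.
surv_obj <- with(dat, Surv(time, status == 2))

# Univariate Cox model for a single predictor
fit_age <- coxph(surv_obj ~ age, data = dat)

# Multivariable Cox model with a small subset of predictors
fit_multi <- coxph(surv_obj ~ age + sex + ph.ecog, data = dat)

# Compare the nested models: likelihood ratio test and AIC
anova(fit_age, fit_multi)
AIC(fit_age, fit_multi)

# Discrimination of the larger model (concordance index)
summary(fit_multi)$concordance

# Test the proportional hazards assumption via scaled Schoenfeld residuals
ph_test <- cox.zph(fit_multi)
print(ph_test)

# Martingale residuals, useful for checking functional form and outliers
mart <- residuals(fit_multi, type = "martingale")
```

Plotting the `cox.zph` object, for example with `plot(ph_test)`, shows the scaled Schoenfeld residuals over time for each term, which can help to interpret a small p-value in the test table.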
Create your response to this task here, as a mixture of embedded (knitr) R code and any resulting outputs, and explanatory or commentary text.
Reminder: don’t forget to save this file, to knit it to check that everything works, and then submit via the drop box in OpenLearning.
When you have finished, and are satisfied with your assignment solutions, and this file knits without errors and the output looks the way you want, then you should submit via the drop box in OpenLearning.
If you encounter problems with any part of the process described above, please contact the course convenor via OpenLearning as soon as possible so that the issues can be resolved in good time, and well before the assignment is due.
Each task attracts the indicated number of marks (out of a total of 30 marks for the assignment). The instructions are deliberately open-ended and less prescriptive than the individual assignments to allow you some latitude in what you do and how you go about the task. However, to complete the tasks and gain full marks, you only need to replicate or repeat the steps covered in the course - if you do most or all of the things described in the relevant chapters of the HDAT9600 course, full marks will be awarded.
Note also that with respect to the model fitting, there are no right or wrong answers when it comes to variable selection and other aspects of model specification. Deep understanding of the underlying medical concepts which govern patient treatment and outcomes in ICUs is not required or assumed, although you should try to gain some understanding of each variable using the links provided. You will not be marked down if your medical justifications are not exactly correct or complete, but do your best, and don’t hesitate to seek help from the course convenor.